Ghosh–Pratt identity

Ghosh–Pratt identity
Type: Theorem
Field: Mathematical statistics
Statement: $E_{\theta_0}\big[\operatorname{vol}(C(X))\big] = \int_{\theta \neq \theta_0} P_{\theta_0}\big(\theta \in C(X)\big)\,d\theta$
First stated by: Jayanta Kumar Ghosh and John Winsor Pratt
First stated in: 1961

In mathematical statistics, the Ghosh–Pratt identity is a theorem that establishes a formal relationship between the expected volume of a confidence set and its probability of false coverage. It is a cornerstone of optimal estimation, as it allows the problem of finding the shortest confidence interval to be framed as a problem of maximizing the power of a statistical test.

The identity was independently discovered and published in 1961 by the Indian statistician Jayanta Kumar Ghosh and the American statistician John Winsor Pratt.

Formal statement

Let X be a random variable with a probability distribution indexed by a parameter θ ∈ Θ. Let C(X) denote a set-valued function of the data that serves as a confidence set for θ. The Ghosh–Pratt identity states that the expected volume (or length, if Θ is one-dimensional) of the confidence set C(X), calculated under the true parameter value θ0, equals the integral of the probabilities of including false values of θ in the set:

$$E_{\theta_0}\big[\operatorname{vol}(C(X))\big] = \int_{\theta \neq \theta_0} P_{\theta_0}\big(\theta \in C(X)\big)\,d\theta$$

In simpler terms, the expected length of a confidence interval is the sum (integral) of the probabilities of covering all possible incorrect values of θ.[1]

Derivation

The proof of the identity relies on the relationship between the volume of a set and the indicator function. Given a specific realization of data x, let C(x) be an arbitrary confidence set for the parameter θ. The volume of this set, vol(C(x)), can be expressed as:

$$\operatorname{vol}\big(C(x)\big) = \int_{\Theta} I\big(\theta \in C(x)\big)\,d\theta$$

where I(·) is the indicator function. To find the expected volume under the true parameter θ0, we take the expectation of both sides with respect to the distribution of X under θ0:

$$E_{\theta_0}\big[\operatorname{vol}(C(X))\big] = E_{\theta_0}\!\left[\int_{\Theta} I\big(\theta \in C(X)\big)\,d\theta\right]$$

By applying Fubini's theorem, we can interchange the expectation and the integral:

$$E_{\theta_0}\big[\operatorname{vol}(C(X))\big] = \int_{\Theta} E_{\theta_0}\big[I\big(\theta \in C(X)\big)\big]\,d\theta$$

Since the expectation of an indicator function is simply the probability of the event indicated:

$$E_{\theta_0}\big[I\big(\theta \in C(X)\big)\big] = P_{\theta_0}\big(\theta \in C(X)\big)$$

Substituting this coverage probability back into the integral gives:

$$E_{\theta_0}\big[\operatorname{vol}(C(X))\big] = \int_{\Theta} P_{\theta_0}\big(\theta \in C(X)\big)\,d\theta$$

This result shows that the expected volume is the integral of the probability that the confidence set covers each value θ (both the true value θ0 and all false values θ ≠ θ0). Since the single point θ0 has Lebesgue measure zero, excluding it does not change the integral, which recovers the stated form with the integral restricted to θ ≠ θ0.
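The derivation above can be checked numerically. A minimal sketch, assuming the textbook setting of a single observation X ~ N(θ0, 1) with the standard 95% interval C(X) = [X − z, X + z] (this concrete model is an illustration, not from the article): the expected length, exactly 2z, should agree with the integral of the coverage probability over θ.

```python
import math

# Hedged sketch: verify the Ghosh-Pratt identity for the standard 95%
# confidence interval for a normal mean with known unit variance.
# The model and interval are illustrative assumptions.

z = 1.959963984540054  # 0.975 quantile of the standard normal
theta0 = 0.3           # arbitrary true parameter value

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Left-hand side: C(X) = [X - z, X + z] always has length exactly 2z.
expected_length = 2.0 * z

# Right-hand side: theta lies in C(X) iff X lies in [theta - z, theta + z],
# so P_{theta0}(theta in C(X)) = Phi(theta + z - theta0) - Phi(theta - z - theta0).
# Integrate this coverage probability over theta with the trapezoidal rule,
# on a grid wide enough that the omitted tails are negligible.
lo, hi, n = theta0 - 10.0, theta0 + 10.0, 200_000
h = (hi - lo) / n
cover = [Phi(lo + i * h + z - theta0) - Phi(lo + i * h - z - theta0)
         for i in range(n + 1)]
integral = h * (sum(cover) - 0.5 * (cover[0] + cover[-1]))

print(abs(expected_length - integral) < 1e-6)  # the two sides agree
```

Changing `theta0` leaves both sides unchanged, reflecting that the interval's length does not depend on the true parameter in this location model.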

Generalization

While the identity is often presented in the context of a continuous random variable, it can be generalized using measure theory to cover discrete random variables and mixed distributions as well. If μ is a sigma-finite measure on the parameter space Θ, the expected measure of the confidence set C(X) under the distribution Pθ0 is given by:

$$E_{\theta_0}\big[\mu\big(C(X)\big)\big] = \int_{\Theta} P_{\theta_0}\big(\theta \in C(X)\big)\,\mu(d\theta)$$

This general form shows that the identity does not depend on the underlying distribution of the data X: whether X is a continuous random variable or a discrete random variable, the relationship between the expected size of the set and the coverage probabilities still holds. The choice of the measure μ is application-specific:

  • If μ is the Lebesgue measure, we obtain the standard formula for the expected volume of a set in $\mathbb{R}^n$.
  • If μ is the counting measure, we obtain the formula for the expected number of points in a discrete set.

This mathematical framework ensures the identity is a robust tool across diverse statistical models.[2]
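The counting-measure case can be illustrated with a toy discrete model (an assumption for this sketch, not from the article): take Θ to be the integers, let X = θ0 + ε with ε uniform on {−1, 0, +1}, and let C(X) = {X−1, X, X+1}. The expected number of points in the set should equal the sum of the coverage probabilities over θ.

```python
# Hedged toy example of the counting-measure form of the identity.
# Theta = the integers, X = theta0 + eps with eps uniform on {-1, 0, +1},
# and C(X) = {X-1, X, X+1}. The model is illustrative only.

theta0 = 0
noise = {-1: 1/3, 0: 1/3, +1: 1/3}  # distribution of eps

# Left-hand side: E[mu(C(X))] with mu the counting measure.
# C(X) always contains exactly three integers.
expected_points = sum(p * len({theta0 + e - 1, theta0 + e, theta0 + e + 1})
                      for e, p in noise.items())

# Right-hand side: sum over theta of P_{theta0}(theta in C(X)).
# theta in C(X) iff |X - theta| <= 1; the probability vanishes
# whenever |theta - theta0| > 2, so a small window suffices.
coverage_sum = sum(p
                   for theta in range(theta0 - 5, theta0 + 6)
                   for e, p in noise.items()
                   if abs((theta0 + e) - theta) <= 1)

print(expected_points, coverage_sum)  # both approximately 3
```

Here the left side is trivially 3 because the set always has three elements; the right side collects 1/3 from each (θ, ε) pair with |θ0 + ε − θ| ≤ 1, of which there are nine.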

Significance and applications

The identity is significant because it connects estimation theory with hypothesis testing. Since a confidence set is often constructed by inverting a family of hypothesis tests, the identity shows that:

  • to minimize the expected length of a confidence interval, one must minimize the probability of covering false values, and
  • minimizing the probability of covering false values is equivalent to maximizing the statistical power of the underlying hypothesis test.

By applying the Neyman–Pearson lemma, which identifies uniformly most powerful tests, statisticians can use the Ghosh–Pratt identity to construct confidence intervals that are mathematically guaranteed to be the shortest possible on average.[3]

History

The identity was published nearly simultaneously in 1961. J.K. Ghosh published his findings in the Calcutta Statistical Association Bulletin, focusing on the relationship between interval types, while John Pratt published his in the Journal of the American Statistical Association focusing on the decision-theoretic implications. While the two approached the problem from slightly different perspectives, their results were mathematically equivalent.[2]

References