Ghosh–Pratt identity
| Type | Theorem |
|---|---|
| Field | Mathematical statistics |
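| Statement | $\operatorname{E}_{\theta_0}[\operatorname{Vol}(C(X))] = \int_{\theta \neq \theta_0} P_{\theta_0}(\theta \in C(X))\, d\theta$ |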
| First stated by | Jayanta Kumar Ghosh, John Winsor Pratt |
| First stated in | 1961 |
In mathematical statistics, the Ghosh–Pratt identity is a theorem that relates the expected volume of a confidence set to its probability of false coverage, that is, the probability that the set includes incorrect parameter values. It is a key tool in the theory of optimal confidence sets, because it allows the problem of finding the confidence interval with the shortest expected length to be framed as a problem of maximizing the power of a statistical test.
The identity was independently discovered and published in 1961 by the Indian statistician Jayanta Kumar Ghosh and the American statistician John Winsor Pratt.
Formal statement
Let $X$ be a random variable with a probability distribution indexed by a parameter $\theta$. Let $C(X)$ denote a random set representing a confidence set for $\theta$. The Ghosh–Pratt identity states that the expected volume (or length, if $\theta$ is one-dimensional) of the confidence set $C(X)$, calculated under the true parameter value $\theta_0$, is equal to the integral of the probabilities of including false values of $\theta$ in the set:
$$\operatorname{E}_{\theta_0}[\operatorname{Vol}(C(X))] = \int_{\theta \neq \theta_0} P_{\theta_0}(\theta \in C(X))\, d\theta$$
In simpler terms, the expected length of a confidence interval equals the integral, over all incorrect values of $\theta$, of the probability that the interval covers each of them.[1]
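The statement can be checked numerically in a simple case. The sketch below assumes a single observation $X \sim N(\theta_0, 1)$ and the usual 95% interval $C(X) = [X - z, X + z]$; this model, the function name `expected_length_via_coverage`, and the integration settings are illustrative choices, not part of the original sources. It integrates the coverage probability over $\theta$ with the trapezoidal rule and compares the result with the interval length $2z$.

```python
from statistics import NormalDist


def expected_length_via_coverage(theta0=0.0, level=0.95,
                                 half_width=12.0, steps=24000):
    """Integrate P_theta0(theta in C(X)) over theta with the trapezoidal rule.

    Assumed model (illustrative): X ~ N(theta0, 1), C(X) = [X - z, X + z].
    Returns (integral of coverage probabilities, interval length 2z).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)     # half-length of the interval
    sampling = NormalDist(mu=theta0, sigma=1.0)   # distribution of X under theta0

    def coverage(theta):
        # theta in [X - z, X + z]  <=>  theta - z <= X <= theta + z
        return sampling.cdf(theta + z) - sampling.cdf(theta - z)

    lo = theta0 - half_width
    hi = theta0 + half_width
    h = (hi - lo) / steps
    total = 0.5 * (coverage(lo) + coverage(hi))
    for i in range(1, steps):
        total += coverage(lo + i * h)
    return total * h, 2 * z


integral, length = expected_length_via_coverage()
print(f"integral of coverage probabilities: {integral:.6f}")
print(f"length of the interval:             {length:.6f}")
```

The single point $\theta = \theta_0$ contributes nothing to the integral, so integrating over all $\theta$ or only over false values gives the same number.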
Derivation
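The proof of the identity relies on the relationship between the volume of a set and the indicator function. Given a specific realization of data $x$, let $C(x)$ be an arbitrary confidence set for the parameter $\theta \in \Theta$. The volume of this set, $\operatorname{Vol}(C(x))$, can be expressed as:
$$\operatorname{Vol}(C(x)) = \int_{\Theta} I(\theta \in C(x))\, d\theta$$
where $I(\cdot)$ is the indicator function. To find the expected volume under the true parameter $\theta_0$, we treat the confidence set as the random set $C(X)$ and take the expected value with respect to the distribution of $X$ given $\theta_0$:
$$\operatorname{E}_{\theta_0}[\operatorname{Vol}(C(X))] = \operatorname{E}_{\theta_0}\left[\int_{\Theta} I(\theta \in C(X))\, d\theta\right]$$
Because the integrand is non-negative, Fubini's theorem allows the order of expectation and integration to be interchanged (provided the set $\{(x, \theta) : \theta \in C(x)\}$ is jointly measurable):
$$\operatorname{E}_{\theta_0}[\operatorname{Vol}(C(X))] = \int_{\Theta} \operatorname{E}_{\theta_0}[I(\theta \in C(X))]\, d\theta$$
Since the expectation of an indicator function is simply the probability of the event indicated:
$$\operatorname{E}_{\theta_0}[I(\theta \in C(X))] = P_{\theta_0}(\theta \in C(X))$$
Substituting this coverage probability back into the integral yields the identity:
$$\operatorname{E}_{\theta_0}[\operatorname{Vol}(C(X))] = \int_{\Theta} P_{\theta_0}(\theta \in C(X))\, d\theta$$
This result shows that the expected volume is the integral of the probability that the confidence set covers any value $\theta$ (both the true value $\theta_0$ and all false values $\theta \neq \theta_0$). Because the single point $\theta_0$ has volume zero, excluding it does not change the integral, so the result agrees with the formal statement, in which the integral runs over false values only.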
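As a worked illustration of the interchange step, consider an assumed example (not taken from the cited sources): a single observation $X \sim N(\theta_0, 1)$ with the interval $C(X) = [X - z, X + z]$, where $\Phi$ and $\varphi$ denote the standard normal distribution function and density. Writing $t = \theta - \theta_0$, the integral of the coverage probabilities can be evaluated by swapping the order of integration, exactly as in the general proof:

```latex
\begin{aligned}
\int_{-\infty}^{\infty} P_{\theta_0}(\theta \in C(X))\, d\theta
  &= \int_{-\infty}^{\infty} \bigl[\Phi(t + z) - \Phi(t - z)\bigr]\, dt \\
  &= \int_{-\infty}^{\infty} \int_{t - z}^{t + z} \varphi(u)\, du\, dt \\
  &= \int_{-\infty}^{\infty} \varphi(u) \left( \int_{u - z}^{u + z} dt \right) du \\
  &= 2z \int_{-\infty}^{\infty} \varphi(u)\, du = 2z .
\end{aligned}
```

The result $2z$ is the length of the interval, as the identity requires.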
Generalization
While the identity is often presented in the context of a continuous random variable, it can be generalized using measure theory to cover discrete random variables and mixed distributions as well. If $\nu$ is a sigma-finite measure on the parameter space $\Theta$, the expected measure of the confidence set $C(X)$ under the distribution $P_{\theta_0}$ is given by:
$$\operatorname{E}_{\theta_0}[\nu(C(X))] = \int_{\Theta} P_{\theta_0}(\theta \in C(X))\, d\nu(\theta)$$
This general form shows that the identity does not depend on the type of distribution of the data $X$: whether $X$ is a continuous random variable or a discrete random variable, the relationship between the expected size of the set and the probability of coverage still holds. The measure $\nu$ on the parameter space, by contrast, is chosen to suit the application:
- If $\nu$ is the Lebesgue measure, we obtain the standard formula for the expected volume of a confidence set in $\mathbb{R}^n$ (the expected length of an interval when $n = 1$).
- If $\nu$ is the counting measure, we obtain the formula for the expected number of points in a discrete confidence set.
Because the derivation relies only on interchanging expectation and integration, the same argument covers all of these cases.[2]
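The counting-measure case can be verified exactly by enumeration. The sketch below is an assumed toy example: the data are $X \sim \text{Binomial}(20, p)$, the parameter space is the grid $0.1, 0.2, \ldots, 0.9$, and `confidence_set` is a hypothetical rule chosen only for illustration (the identity holds for any set-valued rule, whatever its coverage level).

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """Probability that a Binomial(n, p) variable equals x."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def confidence_set(x, n, grid, c=2.0):
    """Hypothetical rule: keep p if x lies within c standard deviations of n*p."""
    return [p for p in grid if abs(x - n * p) <= c * sqrt(n * p * (1 - p))]


n = 20
grid = [k / 10 for k in range(1, 10)]  # parameter space: 0.1, 0.2, ..., 0.9
p0 = 0.3                               # true parameter value

# Left side: expected number of grid points in C(X) under p0.
expected_size = sum(
    binom_pmf(x, n, p0) * len(confidence_set(x, n, grid))
    for x in range(n + 1)
)

# Right side: sum over grid points p of the probability that C(X) contains p.
coverage_sum = sum(
    sum(binom_pmf(x, n, p0)
        for x in range(n + 1)
        if p in confidence_set(x, n, grid))
    for p in grid
)

print(f"expected number of points:    {expected_size:.10f}")
print(f"sum of coverage probabilities: {coverage_sum:.10f}")
```

Both quantities are finite sums over the same pairs $(x, p)$ taken in a different order, so they agree up to floating-point rounding.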
Significance and applications
The identity is significant because it connects estimation theory with hypothesis testing. Since a confidence set is often constructed by inverting a family of hypothesis tests, the identity shows that:
- to minimize the expected length of a confidence interval, one must minimize the probability of covering false values, and
- minimizing the probability of covering false values is equivalent to maximizing the statistical power of the underlying hypothesis test.
By applying the Neyman–Pearson lemma, which identifies the most powerful test of a simple null hypothesis against a simple alternative, statisticians can use the Ghosh–Pratt identity to construct confidence intervals whose expected length, under a given true parameter value, is the smallest among all intervals with the same confidence level.[3]
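The link between power and length can be illustrated with a small numerical sketch. It assumes $X \sim N(\theta_0, 1)$ and compares two 95% intervals obtained by inverting tests with different tail allocations; the helper names `offsets` and `integrated_false_coverage` and the chosen tail splits are hypothetical. For each interval, the probability of covering $\theta$ is one minus the power, at the true value $\theta_0$, of the test of $\theta$, and integrating it over $\theta$ recovers the interval length.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal; X - theta0 has this distribution


def offsets(alpha_lower, alpha_upper):
    """Offsets (l, u): the test of theta accepts when theta - l <= X <= theta + u."""
    return STD.inv_cdf(1 - alpha_lower), STD.inv_cdf(1 - alpha_upper)


def integrated_false_coverage(l, u, half_width=12.0, steps=24000):
    """Trapezoidal integral over d = theta - theta0 of (1 - power at theta0)."""
    def accept_prob(d):
        # P_theta0(theta - l <= X <= theta + u) = Phi(d + u) - Phi(d - l)
        return STD.cdf(d + u) - STD.cdf(d - l)

    h = 2 * half_width / steps
    total = 0.5 * (accept_prob(-half_width) + accept_prob(half_width))
    for i in range(1, steps):
        total += accept_prob(-half_width + i * h)
    return total * h


results = {}
for name, (a_lo, a_hi) in {"equal tails": (0.025, 0.025),
                           "unequal tails": (0.01, 0.04)}.items():
    l, u = offsets(a_lo, a_hi)
    # The inverted interval is [X - u, X + l], with length l + u.
    results[name] = (l + u, integrated_false_coverage(l, u))
    print(f"{name}: length = {l + u:.4f}, integral = {results[name][1]:.4f}")
```

In this example the equal-tailed tests have higher power against nearby alternatives overall, and the corresponding interval is the shorter of the two.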
History
The two papers establishing the identity appeared nearly simultaneously in 1961. J. K. Ghosh published his findings in the Calcutta Statistical Association Bulletin, focusing on the relationship between interval types, while John Pratt published his in the Journal of the American Statistical Association, focusing on the decision-theoretic implications. Although the two approached the problem from slightly different perspectives, their results were mathematically equivalent.[2]
References
- [1] Pratt, John W. (1961). "Length of Confidence Intervals". Journal of the American Statistical Association 56 (295): 549–567. doi:10.1080/01621459.1961.10480644. https://www.tandfonline.com/doi/abs/10.1080/01621459.1961.10480644.
- [2] Casella, George (1996). "The Ghosh-Pratt Identity". Cornell University eCommons. https://ecommons.cornell.edu/bitstream/1813/31908/1/BU-1314-M.pdf.
- [3] Ghosh, J. K. (1961). "On the relation among shortest confidence intervals of different types". Calcutta Statistical Association Bulletin 10 (4): 147–152. doi:10.1177/0008068319610404. https://doi.org/10.1177/0008068319610404.
