Scoring algorithm
Scoring algorithm, also known as Fisher's scoring,[1] is a form of Newton's method used in statistics to solve maximum likelihood equations numerically, named after Ronald Fisher.
Sketch of derivation
Let <math>Y_1,\ldots,Y_n</math> be random variables, independent and identically distributed with twice differentiable p.d.f. <math>f(y; \theta)</math>, and suppose we wish to calculate the maximum likelihood estimator (M.L.E.) <math>\theta^*</math> of <math>\theta</math>. First, suppose we have a starting point for our algorithm <math>\theta_0</math>, and consider a Taylor expansion of the score function, <math>V(\theta)</math>, about <math>\theta_0</math>:

: <math>V(\theta) \approx V(\theta_0) - \mathcal{J}(\theta_0)(\theta - \theta_0),</math>

where

: <math>\mathcal{J}(\theta_0) = -\sum_{i=1}^n \left. \nabla\nabla^{\top} \log f(Y_i ; \theta) \right|_{\theta=\theta_0}</math>

is the observed information matrix at <math>\theta_0</math>. Now, setting <math>\theta = \theta^*</math>, using that <math>V(\theta^*) = 0</math> and rearranging gives us:

: <math>\theta^* \approx \theta_0 + \mathcal{J}^{-1}(\theta_0) V(\theta_0).</math>

We therefore use the algorithm

: <math>\theta_{m+1} = \theta_m + \mathcal{J}^{-1}(\theta_m) V(\theta_m),</math>

and under certain regularity conditions, it can be shown that <math>\theta_m \rightarrow \theta^*</math>.
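The iteration can be made concrete with a toy example (not from the article): for i.i.d. exponential observations with rate <math>\theta</math>, the score is <math>n/\theta - \sum_i y_i</math> and the observed information is <math>n/\theta^2</math>, so the Newton-type update can be checked against the closed-form M.L.E. <math>1/\bar{y}</math>. A minimal Python sketch, assuming a starting point close enough to the optimum for plain Newton steps to converge:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=1000)  # true rate is 0.5
n = len(y)

def score(theta):
    # V(theta) = d/dtheta log-likelihood for Exp(rate=theta): n/theta - sum(y)
    return n / theta - y.sum()

def observed_info(theta):
    # J(theta) = -d^2/dtheta^2 log-likelihood = n / theta^2
    return n / theta**2

theta = 0.8  # starting point theta_0, chosen near the optimum
for _ in range(25):
    step = score(theta) / observed_info(theta)
    theta += step
    if abs(step) < 1e-12:
        break

# For this model the M.L.E. has the closed form 1/mean(y),
# so the iteration can be checked against it directly.
print(theta, 1 / y.mean())
```

The convergence criterion and iteration cap are illustrative choices, not part of the algorithm itself.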
Fisher scoring
In practice, <math>\mathcal{J}(\theta)</math> is usually replaced by <math>\mathcal{I}(\theta) = \mathrm{E}[\mathcal{J}(\theta)]</math>, the Fisher information, thus giving us the Fisher scoring algorithm:

: <math>\theta_{m+1} = \theta_m + \mathcal{I}^{-1}(\theta_m) V(\theta_m).</math>

Under some regularity conditions, if <math>\theta_m</math> is a consistent estimator, then <math>\theta_{m+1}</math> (the correction after a single step) is 'optimal' in the sense that its error distribution is asymptotically identical to that of the true maximum-likelihood estimate.[2]
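As an illustration of the substitution (again not taken from the article): for i.i.d. Poisson(<math>\theta</math>) data the observed information <math>\sum_i Y_i/\theta^2</math> differs from the Fisher information <math>\mathcal{I}(\theta) = n/\theta</math>, and for this particular model a single Fisher scoring step lands exactly on the M.L.E. <math>\bar{y}</math> from any positive starting point:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.poisson(lam=3.0, size=500)
n = len(y)

def score(theta):
    # V(theta) = sum(y)/theta - n for i.i.d. Poisson(theta)
    return y.sum() / theta - n

def fisher_info(theta):
    # I(theta) = E[J(theta)] = n/theta  (expected, not observed, information)
    return n / theta

theta = 10.0  # arbitrary positive starting point
theta = theta + score(theta) / fisher_info(theta)

# theta + (theta/n) * (sum(y)/theta - n) simplifies to mean(y),
# so one step recovers the M.L.E. exactly (up to floating point).
print(theta, y.mean())
```

The one-step exactness is a special feature of this toy model, not a general property; in general Fisher scoring iterates like Newton's method.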
See also
References
1. Longford, Nicholas T. (1987). "A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects". Biometrika 74 (4): 817–827. doi:10.1093/biomet/74.4.817.
2. Li, Bing; Babu, G. Jogesh (2019). "Bayesian Inference". Springer Texts in Statistics. New York, NY: Springer. Theorem 9.4. doi:10.1007/978-1-4939-9761-9_6. ISBN 978-1-4939-9759-6.
Further reading
- Jennrich, R. I.; Sampson, P. F. (1976). "Newton-Raphson and Related Algorithms for Maximum Likelihood Variance Component Estimation". Technometrics 18 (1): 11–17. doi:10.1080/00401706.1976.10489395. JSTOR 1267911.