Parallel analysis
Latest revision as of 04:07, 16 April 2026
Parallel analysis, also known as Horn's parallel analysis, is a statistical method used to determine the number of components to keep in a principal component analysis or factors to keep in an exploratory factor analysis. It is named after psychologist John L. Horn, who created the method, publishing it in the journal Psychometrika in 1965.[1] The method compares the eigenvalues computed from the observed data with the eigenvalues computed from Monte Carlo simulated random data of the same size; a component or factor is retained only if its eigenvalue exceeds the corresponding eigenvalue from the random data.[2] Horn developed the method as a correction to the Kaiser criterion, arguing that sampling error and least-squares capitalization inflate eigenvalues in sample data, causing the Kaiser rule to overestimate the number of factors to retain.[1]
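The comparison described above can be illustrated with a minimal sketch for the principal-component case. This is not taken from any of the cited implementations: the function name `parallel_analysis`, its parameters, and the choice of standard-normal random data are assumptions made for illustration. Using the mean of the simulated eigenvalues as the threshold follows Horn's original proposal; passing a percentile instead corresponds to one of the later variations.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=None, seed=0):
    """Illustrative sketch of parallel analysis for PCA (hypothetical helper).

    data:       (n_observations, n_variables) array
    percentile: None uses the mean of the simulated eigenvalues;
                a number such as 95 uses that percentile instead.
    Returns (n_retain, observed_eigenvalues, thresholds).
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues of correlation matrices of random data of the same size.
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]

    if percentile is None:
        thresholds = simulated.mean(axis=0)
    else:
        thresholds = np.percentile(simulated, percentile, axis=0)

    # Retain leading components until one fails to exceed its threshold.
    exceeds = observed > thresholds
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, thresholds

# Synthetic example: 8 variables driven by 2 uncorrelated factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8
loadings[1, 4:] = 0.8
x = factors @ loadings + 0.6 * rng.standard_normal((500, 8))

n_retain, observed, thresholds = parallel_analysis(x, percentile=95)
print("components retained:", n_retain)
```

The synthetic data are built so that two strong factors underlie the eight variables, which makes the retained count easy to check against the known structure.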
Evaluation and comparison with alternatives
Parallel analysis is regarded as one of the more accurate methods for determining the number of factors or components to retain. In particular, unlike early approaches to dimensionality estimation (such as examining scree plots), parallel analysis has the virtue of an objective decision criterion.[3] In a Monte Carlo simulation study across 96 conditions with known component structure, Zwick and Velicer found that parallel analysis and Velicer's MAP test[4] generally outperformed the other methods tested, while the Kaiser rule tended to severely overestimate the number of components.[3]
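The tendency of the Kaiser rule to overestimate can be seen in a small sketch on pure noise. The helper name `kaiser_count` and the sample dimensions are illustrative assumptions, not part of any cited study. Because the eigenvalues of a correlation matrix sum to the number of variables, sampling error alone pushes some of them above 1 even when the variables are unrelated, so the rule retains components where none exist.

```python
import numpy as np

def kaiser_count(data):
    """Number of correlation-matrix eigenvalues greater than 1 (Kaiser rule)."""
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
    return int(np.sum(eigenvalues > 1.0))

# 20 mutually independent variables: there is no structure to recover.
rng = np.random.default_rng(42)
noise = rng.standard_normal((100, 20))

print("components retained by the Kaiser rule on noise:", kaiser_count(noise))
```

Parallel analysis addresses this by comparing each observed eigenvalue with what random data of the same size produce, rather than with the fixed value 1.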
Since its original publication, multiple variations of parallel analysis have been proposed.[5][6] Other methods of determining the number of factors or components to retain in an analysis include the scree plot, the Kaiser rule, and Velicer's MAP test.[4]
Anton Formann provided both theoretical and empirical evidence that parallel analysis may not be appropriate in many cases, since its performance is influenced by sample size, item discrimination, and the type of correlation coefficient.[7]
An extensive 2022 simulation study by Haslbeck and van Bork[8] found that parallel analysis was among the best-performing existing methods, but was slightly outperformed by their proposed prediction error-based approach.
Implementation
Parallel analysis has been implemented in JASP, SPSS, SAS, Stata, and MATLAB,[9][10][11] and in multiple packages for the R programming language, including the psych,[12][13] multicon,[14] hornpa,[15] and paran[16][17] packages. Parallel analysis can also be conducted in Mplus version 8.0 and later.[18]
See also
- Scree plot
- Exploratory factor analysis § Selecting the appropriate number of factors
- Marchenko-Pastur distribution
References
- ↑ 1.0 1.1 Horn, John L. (June 1965). "A rationale and test for the number of factors in factor analysis". Psychometrika 30 (2): 179–185. doi:10.1007/bf02289447. PMID 14306381.
- ↑ Mike Allen (11 April 2017). The SAGE Encyclopedia of Communication Research Methods. SAGE Publications. p. 518. ISBN 978-1-4833-8142-8. https://books.google.com/books?id=4GFCDgAAQBAJ&pg=PA518.
- ↑ 3.0 3.1 Zwick, William R.; Velicer, Wayne F. (1986). "Comparison of five rules for determining the number of components to retain". Psychological Bulletin 99 (3): 432–442. doi:10.1037/0033-2909.99.3.432.
- ↑ 4.0 4.1 Velicer, W.F. (1976). "Determining the number of components from the matrix of partial correlations". Psychometrika 41 (3): 321–327. doi:10.1007/bf02293557.
- ↑ Glorfeld, Louis W. (2 July 2016). "An Improvement on Horn's Parallel Analysis Methodology for Selecting the Correct Number of Factors to Retain". Educational and Psychological Measurement 55 (3): 377–393. doi:10.1177/0013164495055003002.
- ↑ Crawford, Aaron V.; Green, Samuel B.; Levy, Roy; Lo, Wen-Juo; Scott, Lietta; Svetina, Dubravka; Thompson, Marilyn S. (September 2010). "Evaluation of Parallel Analysis Methods for Determining the Number of Factors". Educational and Psychological Measurement 70 (6): 885–901. doi:10.1177/0013164410379332.
- ↑ Tran, U. S.; Formann, A. K. (2009). "Performance of parallel analysis in retrieving unidimensionality in the presence of binary data". Educational and Psychological Measurement 69: 50–61. doi:10.1177/0013164408318761.
- ↑ Haslbeck, Jonas M. B.; van Bork, Riet (February 2024). "Estimating the number of factors in exploratory factor analysis via out-of-sample prediction errors". Psychological Methods 29 (1): 48–64. doi:10.1037/met0000528. ISSN 1939-1463. https://doi.apa.org/doi/10.1037/met0000528.
- ↑ Hayton, James C.; Allen, David G.; Scarpello, Vida (29 June 2016). "Factor Retention Decisions in Exploratory Factor Analysis: a Tutorial on Parallel Analysis". Organizational Research Methods 7 (2): 191–205. doi:10.1177/1094428104263675.
- ↑ O'Connor, Brian. "Programs for Number of Components and Factors". https://people.ok.ubc.ca/brioconn/nfactors/nfactors.html.
- ↑ O'Connor, Brian P. (September 2000). "SPSS and SAS programs for determining the number of components using parallel analysis and Velicer's MAP test". Behavior Research Methods, Instruments, & Computers 32 (3): 396–402. doi:10.3758/BF03200807. PMID 11029811.
- ↑ Revelle, William (2007). Determining the number of factors: the example of the NEO-PI-R. http://www.personality-project.org/r/book/numberoffactors.pdf.
- ↑ Revelle, William (8 January 2020). "psych: Procedures for Psychological, Psychometric, and Personality Research". https://cran.r-project.org/web/packages/psych/.
- ↑ Sherman, Ryne A. (2 February 2015). "multicon: Multivariate Constructs". https://cran.r-project.org/web/packages/multicon/index.html.
- ↑ Huang, Francis (3 March 2015). "hornpa: Horn's (1965) Test to Determine the Number of Components/Factors". https://cran.r-project.org/web/packages/hornpa/index.html.
- ↑ Dinno, Alexis. Gently Clarifying the Application of Horn's Parallel Analysis to Principal Component Analysis Versus Factor Analysis. https://alexisdinno.com/Software/files/PA_for_PCA_vs_FA.pdf.
- ↑ Dinno, Alexis (14 October 2018). paran: Horn's Test of Principal Components/Factors. https://cran.r-project.org/web/packages/paran/.
- ↑ https://www.statmodel.com/HTML_UG/chapter16V8.htm
