Study heterogeneity

In statistics, (between-)study heterogeneity is a phenomenon that commonly arises when attempting to undertake a meta-analysis. In an idealized scenario, the studies whose results are to be combined would all be undertaken in the same way and under the same experimental protocol, and differences between their outcomes would be due to measurement error alone (the studies would hence be homogeneous). Study heterogeneity denotes the variability in outcomes that goes beyond what would be expected (or could be explained) by measurement error alone.[1]

Introduction

Meta-analysis is a method used to combine the results of different trials in order to obtain a quantitative synthesis. The size of individual clinical trials is often too small to detect treatment effects reliably. Meta-analysis increases the power of statistical analyses by pooling the results of all available trials.

When meta-analysis is used to estimate a combined effect from a group of similar studies, the effects found in the individual studies need to be similar enough that a combined estimate is a meaningful description of the set of studies. The individual estimates of treatment effect will, however, always vary by chance; some variation is expected due to observational error. Any excess variation (whether or not it is apparent or detectable) is called (statistical) heterogeneity.[2] The presence of some heterogeneity is not unusual; analogous variability is commonly encountered even within studies, e.g., between centers in multicenter trials (between-center heterogeneity).

Reasons for the additional variability are usually differences between the studies themselves: the investigated populations, treatment schedules, endpoint definitions, or other circumstances ("clinical diversity"), or the way the data were analyzed, the models employed, or whether estimates were adjusted in some way ("methodological diversity").[1] Different types of effect measures (e.g., odds ratio vs. relative risk) may also be more or less susceptible to heterogeneity.[3]

Modeling

If the origin of heterogeneity can be identified and attributed to certain study features, the analysis may be stratified (by considering subgroups of studies, which would then hopefully be more homogeneous) or extended to a meta-regression accounting for (continuous or categorical) moderator variables. Unfortunately, literature-based meta-analysis often does not allow gathering data on all (potentially) relevant moderators.[4]
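
As an illustration, a simple meta-regression on a single moderator could be fitted by weighted least squares, as sketched below. The data and the moderator ("dose") are hypothetical, and this bare-bones version uses common-effect (inverse-variance) weights and ignores residual heterogeneity, which a full random-effects meta-regression would additionally estimate.

```python
# Minimal sketch of a common-effect meta-regression on one hypothetical moderator,
# fitted by weighted least squares (illustrative data only).
import numpy as np

y = np.array([0.10, 0.35, -0.05, 0.42, 0.20])    # hypothetical study estimates (e.g., log odds ratios)
se = np.array([0.15, 0.20, 0.25, 0.18, 0.30])    # their standard errors
dose = np.array([1.0, 2.0, 0.5, 2.5, 1.5])       # hypothetical moderator value per study

X = np.column_stack([np.ones_like(dose), dose])  # design matrix: intercept + moderator
W = np.diag(1.0 / se**2)                         # inverse-variance weights

# Weighted least-squares solution: beta = (X' W X)^{-1} X' W y
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(f"intercept = {beta[0]:.3f}, moderator slope = {beta[1]:.3f}")
```

In practice, dedicated meta-analysis software would be used, which also estimates the residual between-study variance alongside the regression coefficients.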

In addition, heterogeneity is usually accommodated by using a random-effects model, in which the heterogeneity then constitutes a variance component.[5] The model represents the lack of knowledge about why treatment effects may differ by treating the (potential) differences as unknowns drawn from a distribution of true effects. The centre of this symmetric distribution describes the average of the effects, while its width describes the degree of heterogeneity. The obvious and conventional choice of distribution is a normal distribution. It is difficult to establish the validity of any distributional assumption, and this is a common criticism of random-effects meta-analyses. However, variations in the exact distributional form may not make much of a difference,[6] and simulations have shown that methods are relatively robust even under extreme distributional assumptions, both in estimating heterogeneity[7] and in calculating an overall effect size.[8]
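
Using standard notation (a sketch, not taken verbatim from the cited sources): with $y_i$ denoting the estimate from the $i$-th study, $s_i$ its standard error, $\mu$ the mean effect, and $\tau^2$ the heterogeneity variance, the usual normal-normal hierarchical model may be written as

\[
  y_i \mid \theta_i \sim \mathrm{N}(\theta_i,\, s_i^2), \qquad
  \theta_i \mid \mu, \tau \sim \mathrm{N}(\mu,\, \tau^2), \qquad i = 1, \ldots, k,
\]

so that marginally $y_i \sim \mathrm{N}(\mu,\, s_i^2 + \tau^2)$; the case $\tau = 0$ corresponds to homogeneity.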

Including a random effect in the model makes the inferences (in a sense) more conservative or cautious, as non-zero heterogeneity leads to greater uncertainty (and avoids overconfidence) in the estimation of overall effects. In the special case of a zero heterogeneity variance, the random-effects model reduces to the common-effect model.[9]
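
This can be seen from the standard inverse-variance weighting formulas (again a sketch in the notation introduced above):

\[
  \hat{\mu} = \frac{\sum_{i=1}^{k} w_i\, y_i}{\sum_{i=1}^{k} w_i},
  \qquad
  w_i = \frac{1}{s_i^2 + \hat{\tau}^2},
  \qquad
  \mathrm{SE}(\hat{\mu}) = \Bigl(\sum_{i=1}^{k} w_i\Bigr)^{-1/2}.
\]

For $\hat{\tau}^2 = 0$ the weights reduce to the common-effect weights $1/s_i^2$, while a larger $\hat{\tau}^2$ makes the weights more nearly equal and inflates the standard error, reflecting the additional uncertainty.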

Common meta-analysis models, however, should of course not be applied blindly or naively to collected sets of estimates. If the results to be amalgamated differ substantially (in their contexts or in their estimated effects), a derived meta-analytic average may not correspond to a reasonable estimand.[10][11] When individual studies exhibit conflicting results, there are likely reasons why the results differ; for instance, two subpopulations may experience different pharmacokinetic pathways.[12] In such a scenario, it would be important to know and consider the relevant covariables in an analysis.

Testing

Statistical testing for a non-zero heterogeneity variance is often based on Cochran's Q[13] or related test procedures. This common procedure, however, is questionable for several reasons: such tests have low power,[14] especially in the very common case of only few estimates being combined in the analysis,[15][7] and homogeneity is specified as the null hypothesis, which is then only rejected in the presence of sufficient evidence against it.[16]
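
For illustration, Cochran's Q and the corresponding chi-squared p-value might be computed as in the following sketch; the estimates and standard errors are hypothetical and not taken from any study.

```python
# Minimal sketch of Cochran's Q test for heterogeneity (illustrative data).
import numpy as np
from scipy import stats

y = np.array([0.10, 0.35, -0.05, 0.42, 0.20])   # hypothetical study estimates
se = np.array([0.15, 0.20, 0.25, 0.18, 0.30])   # corresponding standard errors

w = 1.0 / se**2                      # inverse-variance (common-effect) weights
mu_hat = np.sum(w * y) / np.sum(w)   # common-effect pooled estimate
Q = np.sum(w * (y - mu_hat)**2)      # Cochran's Q statistic
df = len(y) - 1
p_value = stats.chi2.sf(Q, df)       # chi-squared test of the homogeneity null hypothesis

print(f"Q = {Q:.2f} on {df} d.f., p = {p_value:.3f}")
```

As noted above, a non-significant p-value here must not be interpreted as evidence of homogeneity, particularly when only a handful of studies are combined.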

Estimation

While the main purpose of a meta-analysis usually is the estimation of the main effect, investigation of the heterogeneity is also crucial for its interpretation. A large number of (frequentist and Bayesian) estimators are available.[17] Bayesian estimation of the heterogeneity usually requires the specification of an appropriate prior distribution.[9][18]

While many of these estimators behave similarly when the number of studies is large, their behaviour differs in particular in the common case of only few estimates.[19] An incorrect zero estimate of the between-study variance is then frequently obtained, leading to a false assumption of homogeneity. Overall, it appears that heterogeneity is consistently underestimated in meta-analyses.[7]
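
As an illustration of how a zero estimate can arise, the widely used DerSimonian-Laird (method-of-moments) estimator truncates negative moment estimates at zero; a minimal sketch with hypothetical numbers follows.

```python
# Minimal sketch of the DerSimonian-Laird (method-of-moments) estimator of tau^2,
# illustrating the truncation at zero mentioned above (hypothetical data).
import numpy as np

y = np.array([0.10, 0.35, -0.05, 0.42, 0.20])   # hypothetical study estimates
se = np.array([0.15, 0.20, 0.25, 0.18, 0.30])   # corresponding standard errors

w = 1.0 / se**2
mu_fe = np.sum(w * y) / np.sum(w)               # common-effect estimate
Q = np.sum(w * (y - mu_fe)**2)                  # Cochran's Q
k = len(y)
c = np.sum(w) - np.sum(w**2) / np.sum(w)

tau2_dl = max(0.0, (Q - (k - 1)) / c)           # negative moment estimates are truncated to zero
print(f"tau^2 (DL) = {tau2_dl:.4f}")
```

Whenever Q falls below its degrees of freedom (k − 1), the estimate is exactly zero, which is one mechanism behind the underestimation of heterogeneity described above.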

Quantification

The heterogeneity variance is commonly denoted by τ², or the standard deviation (its square root) by τ. Heterogeneity is probably most readily interpretable in terms of τ, as this is the heterogeneity distribution's scale parameter, which is measured in the same units as the overall effect itself.[18]

Another common measure of heterogeneity is I², a statistic that indicates the percentage of variance in a meta-analysis that is attributable to study heterogeneity (somewhat similar to a coefficient of determination).[20] I² relates the magnitude of the heterogeneity variance to the size of the individual estimates' variances (squared standard errors); with this normalisation, however, it is not obvious what exactly would constitute "small" or "large" amounts of heterogeneity. For a constant heterogeneity (τ), the availability of smaller or larger studies (with correspondingly differing standard errors) affects the I² measure, so the interpretation of an I² value is not straightforward.[21][22]
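
A minimal sketch of computing I² from Cochran's Q, using the common definition I² = max{0, (Q − df)/Q} and the same hypothetical numbers as above:

```python
# Minimal sketch of the I^2 statistic derived from Cochran's Q (hypothetical data).
import numpy as np

y = np.array([0.10, 0.35, -0.05, 0.42, 0.20])
se = np.array([0.15, 0.20, 0.25, 0.18, 0.30])

w = 1.0 / se**2
mu_fe = np.sum(w * y) / np.sum(w)
Q = np.sum(w * (y - mu_fe)**2)
df = len(y) - 1

I2 = max(0.0, (Q - df) / Q) * 100.0   # percentage of variability attributed to heterogeneity
print(f"I^2 = {I2:.1f}%")
```

Re-running the sketch with all standard errors halved (i.e., larger studies) yields a larger I² even though the underlying heterogeneity is unchanged, illustrating the caveat described above.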

The joint consideration of a prediction interval along with a confidence interval for the main effect may help in getting a better sense of the contribution of heterogeneity to the uncertainty around the effect estimate.[5][23][24][25]
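
As an illustration (a sketch with hypothetical data, using the DerSimonian-Laird τ² estimate and the approximate prediction-interval formula based on a t-distribution with k − 2 degrees of freedom), confidence and prediction intervals might be computed as follows:

```python
# Minimal sketch contrasting a confidence interval with a prediction interval
# under a random-effects model (hypothetical data).
import numpy as np
from scipy import stats

y = np.array([0.10, 0.35, -0.05, 0.42, 0.20])
se = np.array([0.15, 0.20, 0.25, 0.18, 0.30])
k = len(y)

# DerSimonian-Laird estimate of the heterogeneity variance tau^2:
w_fe = 1.0 / se**2
mu_fe = np.sum(w_fe * y) / np.sum(w_fe)
Q = np.sum(w_fe * (y - mu_fe)**2)
c = np.sum(w_fe) - np.sum(w_fe**2) / np.sum(w_fe)
tau2 = max(0.0, (Q - (k - 1)) / c)

# Random-effects pooled estimate and its standard error:
w = 1.0 / (se**2 + tau2)
mu = np.sum(w * y) / np.sum(w)
se_mu = np.sqrt(1.0 / np.sum(w))

z = stats.norm.ppf(0.975)
ci = (mu - z * se_mu, mu + z * se_mu)            # 95% confidence interval for the mean effect

t = stats.t.ppf(0.975, df=k - 2)                 # t-quantile with k-2 degrees of freedom
half_width = t * np.sqrt(tau2 + se_mu**2)
pi = (mu - half_width, mu + half_width)          # 95% prediction interval for a new study's effect

print(f"mean effect: {mu:.3f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f}),  95% PI: ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is wider than the confidence interval whenever the heterogeneity estimate is positive, as it describes where the true effect of a new, similar study would be expected to lie rather than the average effect alone.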

References

  1. Deeks, J.J.; Higgins, J.P.T.; Altman, D.G. (2021), "10.10 Heterogeneity", in Higgins, J.P.T.; Thomas, J.; Chandler, J. et al., Cochrane Handbook for Systematic Reviews of Interventions (6.2 ed.), https://training.cochrane.org/handbook/current/chapter-10#section-10-10
  2. Singh, A.; Hussain, S.; Najmi, A.N. (2017), "Number of studies, heterogeneity, generalisability, and the choice of method for meta-analysis", Journal of the Neurological Sciences 15 (381): 347, doi:10.1016/j.jns.2017.09.026, PMID 28967410 
  3. Deeks, J.J.; Altman, D.G. (2001), "Effect measures for meta-analysis of trials with binary outcomes", Systematic reviews in health care: Meta-analysis in context (2nd ed.), BMJ Publishing, pp. 313–335, doi:10.1002/9780470693926.ch16, ISBN 9780470693926 
  4. Cooper, Harris; Hedges, Larry V.; Valentine, Jeffrey C. (2019), The Handbook of Research Synthesis and Meta-Analysis, Russell Sage Foundation, ISBN 978-1-61044-886-4, https://books.google.com/books?id=tfeXDwAAQBAJ&dq=The+Handbook+of+research+synthesis+and+meta-analysis&pg=PR5
  5. Riley, R. D.; Higgins, J. P.; Deeks, J. J. (2011), "Interpretation of random-effects meta-analyses", BMJ 342: d549, doi:10.1136/bmj.d549, PMID 21310794
  6. Bretthorst, G.L. (1999), "The near-irrelevance of sampling frequency distributions", Maximum Entropy and Bayesian methods, Kluwer Academic Publishers, pp. 21–46, doi:10.1007/978-94-011-4710-1_3, ISBN 978-94-010-5982-4 
  7. Kontopantelis, E.; Springate, D. A.; Reeves, D. (2013), "A re-analysis of the Cochrane Library data: The dangers of unobserved heterogeneity in meta-analyses", PLOS ONE 8 (7): e69930, doi:10.1371/journal.pone.0069930, PMID 23922860, Bibcode 2013PLoSO...869930K
  8. Kontopantelis, E.; Reeves, D. (2012), "Performance of statistical methods for meta-analysis when true study effects are non-normally distributed: A simulation study", Statistical Methods in Medical Research 21 (4): 409–26, doi:10.1177/0962280210392008, PMID 21148194, https://www.research.manchester.ac.uk/portal/en/publications/performance-of-statistical-methods-for-metaanalysis-when-true-study-effects-are-nonnormally-distributed-a-simulation-study(2634dc5c-0dbe-47d2-83ce-d58277108743).html
  9. Röver, C. (2020), "Bayesian random-effects meta-analysis using the bayesmeta R package", Journal of Statistical Software 93 (6): 1–51, doi:10.18637/jss.v093.i06
  10. Cornell, John E.; Mulrow, Cynthia D.; Localio, Russell; Stack, Catharine B.; Meibohm, Anne R.; Guallar, Eliseo; Goodman, Steven N. (2014), "Random-Effects Meta-analysis of Inconsistent Effects: A Time for Change", Annals of Internal Medicine 160 (4): 267–270, doi:10.7326/M13-2886, ISSN 0003-4819, PMID 24727843, https://www.acpjournals.org/doi/abs/10.7326/M13-2886
  11. Maziarz, Mariusz (2022), "Is meta-analysis of RCTs assessing the efficacy of interventions a reliable source of evidence for therapeutic decisions?", Studies in History and Philosophy of Science 91: 159–167, doi:10.1016/j.shpsa.2021.11.007, ISSN 0039-3681, PMID 34922183
  12. Borenstein, Michael; Hedges, Larry V.; Higgins, Julian P. T.; Rothstein, Hannah R. (2010), "A basic introduction to fixed-effect and random-effects models for meta-analysis", Research Synthesis Methods 1 (2): 97–111, doi:10.1002/jrsm.12, ISSN 1759-2887, PMID 26061376, https://onlinelibrary.wiley.com/doi/abs/10.1002/jrsm.12
  13. Cochran, W.G. (1954), "The combination of estimates from different experiments", Biometrics 10 (1): 101–129, doi:10.2307/3001666 
  14. Hardy, R.J.; Thompson, S.G. (1998), "Detecting and describing heterogeneity in meta-analysis", Statistics in Medicine 17 (8): 841–856, doi:10.1002/(SICI)1097-0258(19980430)17:8<841::AID-SIM781>3.0.CO;2-D, PMID 9595615 
  15. Davey, J.; Turner, R.M.; Clarke, M.J.; Higgins, J.P.T. (2011), "Characteristics of meta-analyses and their component studies in the Cochrane Database of Systematic Reviews: a cross-sectional, descriptive analysis", BMC Medical Research Methodology 11 (1): 160, doi:10.1186/1471-2288-11-160, PMID 22114982 
  16. Li, W.; Liu, F.; Snavely, D. (2020), "Revisit of test‐then‐pool methods and some practical considerations", Pharmaceutical Statistics 19 (5): 498–517, doi:10.1002/pst.2009, PMID 32171048 
  17. Veroniki, A.A.; Jackson, D.; Viechtbauer, W.; Bender, R.; Bowden, J.; Knapp, G.; Kuß, O.; Higgins, J.P.T. et al. (2016), "Methods to estimate the between-study variance and its uncertainty in meta-analysis", Research Synthesis Methods 7 (1): 55–79, doi:10.1002/jrsm.1164, PMID 26332144 
  18. Röver, C.; Bender, R.; Dias, S.; Schmid, C.H.; Schmidli, H.; Sturtz, S.; Weber, S.; Friede, T. (2021), "On weakly informative prior distributions for the heterogeneity parameter in Bayesian random‐effects meta‐analysis", Research Synthesis Methods 12 (4): 448–474, doi:10.1002/jrsm.1475, PMID 33486828
  19. Friede, T.; Röver, C.; Wandel, S.; Neuenschwander, B. (2017), "Meta-analysis of few small studies in orphan diseases", Research Synthesis Methods 8 (1): 79–91, doi:10.1002/jrsm.1217, PMID 27362487 
  20. Higgins, J. P. T.; Thompson, S. G.; Deeks, J. J.; Altman, D. G. (2003), "Measuring inconsistency in meta-analyses", BMJ 327 (7414): 557–560, doi:10.1136/bmj.327.7414.557, PMID 12958120 
  21. Rücker, G.; Schwarzer, G.; Carpenter, J.R.; Schumacher, M. (2008), "Undue reliance on I² in assessing heterogeneity may mislead", BMC Medical Research Methodology 8 (79): 79, doi:10.1186/1471-2288-8-79, PMID 19036172 
  22. Borenstein, M.; Higgins, J.P.T.; Hedges, L.V.; Rothstein, H.R. (2017), "Basics of meta-analysis: I² is not an absolute measure of heterogeneity", Research Synthesis Methods 8 (1): 5–18, doi:10.1002/jrsm.1230, PMID 28058794, https://research-information.bris.ac.uk/ws/files/88672054/Final_I_squared_paper_full.pdf 
  23. Chiolero, A; Santschi, V.; Burnand, B.; Platt, R.W.; Paradis, G. (2012), "Meta-analyses: with confidence or prediction intervals?", European Journal of Epidemiology 27 (10): 823–5, doi:10.1007/s10654-012-9738-y, PMID 23070657, http://doc.rero.ch/record/320413/files/10654_2012_Article_9738.pdf 
  24. Bender, R.; Kuß, O.; Koch, A.; Schwenke, C.; Hauschke, D. (2014), Application of prediction intervals in meta-analyses with random effects, Joint statement of IQWiG, GMDS and IBS-DR, https://www.iqwig.de/download/2014-03-07_Joint_Statement_Prediction_Intervals.pdf 
  25. IntHout, J; Ioannidis, J.P.A.; Rovers, M.M.; Goeman, J.J. (2016), "Plea for routinely presenting prediction intervals in meta-analysis", BMJ Open 6 (7): e010247, doi:10.1136/bmjopen-2015-010247, PMID 27406637, PMC 4947751, https://bmjopen.bmj.com/content/bmjopen/6/7/e010247.full.pdf 
