Multilevel modeling for repeated measures

From HandWiki

One application of multilevel modeling (MLM) is the analysis of repeated measures data. Multilevel modeling for repeated measures data is most often discussed in the context of modeling change over time (i.e. growth curve modeling for longitudinal designs); however, it may also be used for repeated measures data in which time is not a factor.[1] In multilevel modeling, an overall change function (e.g. linear, quadratic, cubic etc.) is fitted to the whole sample and, just as in multilevel modeling for clustered data, the slope and intercept may be allowed to vary. For example, in a study looking at income growth with age, individuals might be assumed to show linear improvement over time. However, the exact intercept and slope could be allowed to vary across individuals (i.e. defined as random coefficients).

Multilevel modeling with repeated measures employs the same statistical techniques as MLM with clustered data. In multilevel modeling for repeated measures data, the measurement occasions are nested within cases (e.g. individual or subject). Thus, the level-1 units are the repeated measures for each subject, and the level-2 unit is the individual or subject. In addition to estimating overall parameters, MLM allows regression equations at the level of the individual. Thus, as a growth curve modeling technique, it allows the estimation of inter-individual differences in intra-individual change over time by modeling the variances and covariances.[2] In other words, it allows the testing of individual differences in patterns of responses over time (i.e. growth curves). This characteristic of multilevel modeling makes it preferable, for certain research questions, to other repeated measures techniques such as repeated measures analysis of variance (RM-ANOVA).
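As a sketch, the linear growth model described above can be written as a pair of equations. The notation (π, β, u, e, with t indexing occasions and i indexing subjects) is a common textbook convention assumed here for illustration; it is not taken from the cited sources.

```latex
\begin{aligned}
\text{Level 1 (occasions):}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{Time}_{ti} + e_{ti} \\
\text{Level 2 (subjects):}\quad  & \pi_{0i} = \beta_{00} + u_{0i} \\
                                 & \pi_{1i} = \beta_{10} + u_{1i}
\end{aligned}
```

Under this notation, β00 and β10 are the average intercept and slope, u0i and u1i are subject-specific deviations from those averages, and e_ti is the occasion-level residual.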

Assumptions

The assumptions of MLM that hold for clustered data also apply to repeated measures:

(1) Random components are assumed to have a normal distribution with a mean of zero.
(2) The dependent variable is assumed to be normally distributed. However, binary and discrete dependent variables may be examined in MLM using specialized procedures (i.e. by employing different link functions).[3]
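To make the idea of a link function concrete, the following minimal sketch shows the logit link and its inverse. The intercept and slope values are hypothetical and chosen only for illustration; real multilevel software would estimate them from data.

```python
import math

def logit(p):
    """Map a probability in (0, 1) to the log-odds scale."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Map a linear predictor (log-odds) back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical linear predictor for one subject: intercept + slope * time.
intercept, slope = -1.0, 0.5
probabilities = [inv_logit(intercept + slope * t) for t in range(4)]
```

The model stays linear on the log-odds scale, while the predicted probabilities remain between 0 and 1.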

One of the assumptions of using MLM for growth curve modeling is that all subjects show the same relationship over time (e.g. linear, quadratic etc.). Another assumption of MLM for growth curve modeling is that the observed changes are related to the passage of time.[4]

Statistics & Interpretation

Mathematically, multilevel analysis with repeated measures is very similar to the analysis of data in which subjects are clustered in groups. However, one point to note is that time-related predictors must be explicitly entered into the model to evaluate trend analyses and to obtain an overall test of the repeated measure. Furthermore, interpretation of these analyses is dependent on the scale of the time variable (i.e. how it is coded).
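The effect of time coding can be seen even in a single subject's regression line. The sketch below is plain least squares for one subject, not a multilevel model; it uses the scores of subject 1 from the Data Structure example, and the helper name `ols_line` is an assumption made for illustration.

```python
def ols_line(times, values):
    """Ordinary least-squares intercept and slope for one subject's series."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope

scores = [12, 8, 4]  # subject 1 in the Data Structure example

# Time coded 0, 1, 2: the intercept is the predicted score at the first occasion.
b0_first, b1_first = ols_line([0, 1, 2], scores)

# Time centred on the middle occasion (-1, 0, 1): the intercept is now the
# predicted score at the second occasion, while the slope is unchanged.
b0_mid, b1_mid = ols_line([-1, 0, 1], scores)
```

Here the intercept moves from 12 to 8 while the slope stays at -4, which is why the meaning of the intercept (and of its variance across subjects) depends on where time is set to zero.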

  • Fixed Effects: Fixed regression coefficients may be obtained for an overall equation that represents how, averaging across subjects, the subjects change over time.
  • Random Effects: Random effects are the variance components that arise from measuring the relationship of the predictors to Y for each subject separately. These variance components include: (1) differences in the intercepts of these equations at the level of the subject; (2) differences across subjects in the slopes of these equations; and (3) covariance between subject slopes and intercepts across all subjects. When random coefficients are specified, each subject has its own regression equation, making it possible to evaluate whether subjects differ in their means and/or response patterns over time.
  • Estimation Procedures & Comparing Models: These procedures are identical to those used in multilevel analysis where subjects are clustered in groups.
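The distinction between fixed and random effects can be illustrated with a simple two-stage calculation: fit a separate line to each subject, then summarize the resulting intercepts and slopes. This is only a descriptive analogue, using the four subjects from the Data Structure example; multilevel software estimates these quantities jointly (typically by maximum likelihood or restricted maximum likelihood) rather than in two stages, so its estimates will generally differ. The helper `fit_line` is a hypothetical name.

```python
from statistics import mean, variance

def fit_line(times, values):
    """Least-squares intercept and slope for a single subject."""
    t_bar, y_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

times = [0, 1, 2]
scores = {1: [12, 8, 4], 2: [11, 7, 6], 3: [15, 12, 10], 4: [11, 10, 9]}

fits = {subject: fit_line(times, y) for subject, y in scores.items()}
intercepts = [b0 for b0, _ in fits.values()]
slopes = [b1 for _, b1 in fits.values()]

# Rough analogue of the fixed effects: the average line across subjects.
mean_intercept, mean_slope = mean(intercepts), mean(slopes)

# Rough analogue of the random effects: spread of the subject-specific lines.
var_intercept, var_slope = variance(intercepts), variance(slopes)
cov_intercept_slope = sum(
    (b0 - mean_intercept) * (b1 - mean_slope)
    for b0, b1 in zip(intercepts, slopes)
) / (len(slopes) - 1)
```

The three summary quantities at the end correspond to the three variance components listed above: intercept variance, slope variance, and the intercept-slope covariance.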

Extensions

  • Modeling Non-Linear Trends (Polynomial Models): Non-linear trends (quadratic, cubic, etc.) may be evaluated in MLM by adding powers of Time (Time × Time, Time × Time × Time, etc.) to the model as either fixed or random effects.
  • Adding Predictors to the Model: It is possible that some of the random variance (i.e. variance associated with individual differences) may be attributed to fixed predictors other than time. Unlike RM-ANOVA, multilevel analysis allows the use of continuous predictors (rather than only categorical), and these predictors may or may not account for individual differences in the intercepts as well as for differences in slopes. Furthermore, multilevel modeling also allows time-varying covariates.
  • Alternative Specifications:
  • Covariance Structure: Multilevel software provides several different covariance or error structures to choose from for the analysis of multilevel data (e.g. autoregressive). These may be applied to the growth model as appropriate.
  • Dependent Variable: Dichotomous dependent variables may be analyzed with multilevel analysis by using more specialized analysis (i.e. using the logit or probit link functions).
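As an illustration of what choosing an error structure means, the sketch below builds two covariance matrices for three measurement occasions: compound symmetry and a first-order autoregressive, AR(1), structure. The function names and the numeric values (variance 4, covariance 2, correlation 0.5) are assumptions chosen for the example, not output from any particular package.

```python
def compound_symmetry(n_occasions, variance, covariance):
    """Equal variances on the diagonal, one common covariance elsewhere."""
    return [[variance if i == j else covariance for j in range(n_occasions)]
            for i in range(n_occasions)]

def ar1(n_occasions, variance, rho):
    """First-order autoregressive: covariance decays as rho ** lag."""
    return [[variance * rho ** abs(i - j) for j in range(n_occasions)]
            for i in range(n_occasions)]

cs = compound_symmetry(3, variance=4.0, covariance=2.0)
ar = ar1(3, variance=4.0, rho=0.5)
```

Under compound symmetry every pair of occasions shares the same covariance, whereas under AR(1) occasions that are further apart in time are less strongly related.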

Multilevel modeling versus other statistical techniques for repeated measures

Multilevel Modeling versus RM-ANOVA

Repeated measures analysis of variance (RM-ANOVA) has traditionally been used for the analysis of repeated measures designs. However, violation of the assumptions of RM-ANOVA can be problematic. Multilevel modeling (MLM) is commonly used for repeated measures designs because it offers an alternative approach to analyzing this type of data, with several advantages over RM-ANOVA:[5]

1. MLM has Less Stringent Assumptions: MLM can be used if the RM-ANOVA assumptions of constant variances (homogeneity of variance, or homoscedasticity), constant covariances (compound symmetry), or constant variances of difference scores (sphericity) are violated. MLM allows the variance-covariance matrix to be modeled from the data; thus, unlike in RM-ANOVA, these assumptions are not necessary.[6]
2. MLM Allows Hierarchical Structure: MLM can be used for higher-order sampling procedures, whereas RM-ANOVA is limited to examining two-level sampling procedures. In other words, MLM can look at repeated measures within subjects, within a third level of analysis etc., whereas RM-ANOVA is limited to repeated measures within subjects.
3. MLM can Handle Missing Data: Missing data is permitted in MLM without causing additional complications. With RM-ANOVA, a subject’s data must be excluded if even a single data point is missing. Missing data and attempts to resolve missing data (e.g. using the subject’s mean for non-missing data) can raise additional problems in RM-ANOVA.
4. MLM can also handle data in which there is variation in the exact timing of data collection (i.e. variable timing versus fixed timing). For example, data for a longitudinal study may attempt to collect measurements at age 6 months, 9 months, 12 months, and 15 months. However, participant availability, bank holidays, and other scheduling issues may result in variation regarding when data is collected. This variation may be addressed in MLM by adding “age” into the regression equation. There is also no need for equal intervals between measurement points in MLM.
5. MLM is relatively easily extended to discrete data.[7]
Note: Although missing data is permitted in MLM, it is assumed to be missing at random. Thus, systematically missing data can present problems.[5][8][9]
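The sphericity assumption mentioned above concerns the variances of the difference scores between every pair of occasions. The following sketch computes those variances for the four subjects in the Data Structure example; with so few subjects it is purely illustrative and not a formal test. The function name is a hypothetical choice.

```python
from itertools import combinations
from statistics import variance

# Wide-form scores from the Data Structure example: subject -> [Time0, Time1, Time2].
wide = {1: [12, 8, 4], 2: [11, 7, 6], 3: [15, 12, 10], 4: [11, 10, 9]}

def difference_score_variances(scores_by_subject):
    """Sample variance of the difference scores for every pair of occasions."""
    n_occasions = len(next(iter(scores_by_subject.values())))
    result = {}
    for a, b in combinations(range(n_occasions), 2):
        diffs = [row[a] - row[b] for row in scores_by_subject.values()]
        result[(a, b)] = variance(diffs)
    return result

diff_vars = difference_score_variances(wide)
```

In this toy data the variances are 2, 6, and 2 for the pairs (Time0, Time1), (Time0, Time2), and (Time1, Time2). Sphericity would require these to be equal in the population; MLM avoids the requirement by modeling the covariance structure directly.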

Multilevel Modeling versus Structural Equation Modeling (SEM; Latent Growth Model)

An alternative method of growth curve analysis is latent growth curve modeling using structural equation modeling (SEM). This approach will provide the same estimates as the multilevel modeling approach, provided that the model is specified identically in SEM. However, there are circumstances in which one approach is preferable to the other:[4][6]

Multilevel modeling approach:
  • For designs with a large number of unequal intervals between time points (SEM cannot manage data with a lot of variation in time points)
  • When there are many data points per subject
  • When the growth model is nested in additional levels of analysis (i.e. hierarchical structure)
  • Multilevel modeling programs have more options for handling non-continuous dependent variables (link functions) and for allowing different error structures
Structural equation modeling approach:
  • Better suited for extended models in which the model is embedded into a larger path model, or the intercept and slope are used as predictors for other variables. In this way, SEM allows greater flexibility.

The distinction between multilevel modeling and latent growth curve analysis has become less defined. Some statistical programs incorporate multilevel features within their structural equation modeling software, and some multilevel modeling software is beginning to add latent growth curve features.

Data Structure

Multilevel modeling with repeated measures data is computationally complex. Computer software capable of performing these analyses may require data to be represented in “long form” as opposed to “wide form” prior to analysis. In long form, each subject’s data is represented in several rows – one for every “time” point (observation of the dependent variable). This is opposed to wide form in which there is one row per subject, and the repeated measures are represented in separate columns. Also note that, in long form, time invariant variables are repeated across rows for each subject. See below for an example of wide form data transposed into long form:

Wide form:

Subject  Group  Time0  Time1  Time2
1        1      12     8      4
2        1      11     7      6
3        2      15     12     10
4        2      11     10     9

Long form:

Subject  Group  Time  DepVar
1        1      0     12
1        1      1     8
1        1      2     4
...      ...    ...   ...
4        2      0     11
4        2      1     10
4        2      2     9
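The reshaping from wide to long form can be sketched in a few lines. The function below is a minimal standard-library illustration using the example data above; the function name, its parameters, and the convention of marking a missing occasion with `None` are assumptions made for this example.

```python
wide_rows = [
    {"Subject": 1, "Group": 1, "Time0": 12, "Time1": 8, "Time2": 4},
    {"Subject": 2, "Group": 1, "Time0": 11, "Time1": 7, "Time2": 6},
    {"Subject": 3, "Group": 2, "Time0": 15, "Time1": 12, "Time2": 10},
    {"Subject": 4, "Group": 2, "Time0": 11, "Time1": 10, "Time2": 9},
]

def wide_to_long(rows, id_vars=("Subject", "Group"), prefix="Time",
                 value_name="DepVar"):
    """Turn one row per subject into one row per subject and occasion."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_vars or not column.startswith(prefix):
                continue
            if value is None:
                # A missing occasion simply produces no row for that time point.
                continue
            long_row = {name: row[name] for name in id_vars}  # time-invariant
            long_row["Time"] = int(column[len(prefix):])
            long_row[value_name] = value
            long_rows.append(long_row)
    return long_rows

long_rows = wide_to_long(wide_rows)
```

With this layout, a subject who missed one occasion contributes two rows instead of three rather than being dropped entirely, which is how long-form data accommodates incomplete cases.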

Further reading

  • Heo, Moonseong; Faith, Myles S.; Mott, John W.; Gorman, Bernard S.; Redden, David T.; Allison, David B. (2003). "Hierarchical linear models for the development of growth curves: an example with body mass index in overweight/obese adults". Statistics in Medicine 22 (11): 1911–1942. doi:10.1002/sim.1218. PMID 12754724. 
  • Singer, J. D. (1998). "Using SAS PROC MIXED to Fit Multilevel Models, Hierarchical Models, and Individual Growth Models". Journal of Educational and Behavioral Statistics 23 (4): 323–355. doi:10.3102/10769986023004323. 
  • Singer, Judith D.; Willett, John B. (2003). Applied longitudinal data analysis : modeling change and event occurrence. Oxford: Oxford University Press. ISBN 978-0195152968.  Concentrates on SAS and on simpler growth models.
  • Snijders, Tom A.B.; Bosker, Roel J. (2002). Multilevel analysis : an introduction to basic and advanced multilevel modeling (Reprint. ed.). London: Sage Publications. ISBN 978-0761958901. 
  • Hedeker, Donald (2006). Longitudinal data analysis. Hoboken, N.J: Wiley-Interscience. ISBN 978-0471420279.  Covers many models and shows the advantages of MLM over other approaches
  • Verbeke, Geert (2013). Linear mixed models for longitudinal data. S.l: Springer-Verlag New York. ISBN 978-1475773842.  Has extensive SAS code.
  • Molenberghs, Geert (2005). Models for discrete longitudinal data. New York: Springer Science+Business Media, Inc. ISBN 978-0387251448.  Covers non-linear models. Has SAS code.
  • Pinheiro, Jose; Bates, Douglas M. (2000). Mixed-effects models in S and S-PLUS. New York, NY u.a: Springer. ISBN 978-1441903174.  Uses S and S-plus but will be useful for R users as well.

Notes

  1. Hoffman, Lesa; Rovine, Michael J. (2007). "Multilevel models for the experimental psychologist: Foundations and illustrative examples". Behavior Research Methods 39 (1): 101–117. doi:10.3758/BF03192848. PMID 17552476. 
  2. Curran, Patrick J.; Obeidat, Khawla; Losardo, Diane (2010). "Twelve Frequently Asked Questions About Growth Curve Modeling". Journal of Cognition and Development 11 (2): 121–136. doi:10.1080/15248371003699969. PMID 21743795. 
  3. Snijders, Tom A.B.; Bosker, Roel J. (2002). Multilevel analysis : an introduction to basic and advanced multilevel modeling (Reprint. ed.). London: Sage Publications. ISBN 978-0761958901. 
  4. Hox, Joop (2005). Multilevel and SEM Approaches to Growth Curve Modeling ([Repr.]. ed.). Chichester: Wiley. ISBN 978-0-470-86080-9. http://joophox.net/publist/ebs05.pdf. 
  5. Quené, Hugo; van den Bergh, Huub (2004). "On multi-level modeling of data from repeated measures designs: a tutorial". Speech Communication 43 (1–2): 103–121. doi:10.1016/j.specom.2004.02.004. 
  6. Cohen, Jacob; Cohen, Patricia; West, Stephen G.; Aiken, Leona S. (2003-10-03). Applied multiple regression/correlation analysis for the behavioral sciences (3. ed.). Mahwah, NJ [u.a.]: Erlbaum. ISBN 9780805822236. 
  7. Molenberghs, Geert (2005). Models for discrete longitudinal data. New York: Springer Science+Business Media, Inc. ISBN 978-0387251448. 
  8. Overall, John E.; Tonidandel, Scott (2007). "Analysis of Data from a Controlled Repeated Measurements Design with Baseline-Dependent Dropouts". Methodology: European Journal of Research Methods for the Behavioral and Social Sciences 3 (2): 58–66. doi:10.1027/1614-2241.3.2.58. 
  9. Overall, John; Ahn, Chul; Shivakumar, C.; Kalburgi, Yallapa (1999). "Problematic formulations of SAS PROC.MIXED models for repeated measurements". Journal of Biopharmaceutical Statistics 9 (1): 189–216. doi:10.1081/BIP-100101008. PMID 10091918. 
