Chapter 30.

Analyzing longitudinal data has an important advantage over analyzing cross-sectional data: it can distinguish changes that occur within subjects from differences between subjects. In cross-sectional data, subjects are measured at a single moment in time, so differences in the subjects' characteristics are the only source of variation available to explain the outcome of scientific interest. The standard ordinary least squares (OLS) linear regression technique is often sufficient to describe and make inferences about the relationships between variables in cross-sectional data. Numerous excellent textbooks cover this topic at a basic (e.g. Kleinbaum et al., 1998) and an advanced level (e.g. Weisberg, 1985).
Statistical methods for longitudinal data should be able to distinguish between the two sources of variation: the within-subjects variation, which accounts for changes within each subject across repeated measurements over time, and the between-subjects variation, which accounts for differences between subjects' performances. OLS linear regression is in general not suitable for analyzing longitudinal data, because it incorrectly treats all observations as if they were uncorrelated.
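
The two sources of variation can be illustrated with a small simulation. The sketch below is not taken from the chapter or its data sets: the function names, sample sizes, and standard deviations are assumptions chosen only for illustration. It simulates repeated measurements nested within subjects and then splits the total variation into a between-subjects and a within-subjects component using simple method-of-moments estimates for balanced data.

```python
import random
import statistics


def simulate(n_subjects=200, n_times=5, sd_between=2.0, sd_within=1.0, seed=0):
    """Return one list of repeated measurements per subject (hypothetical data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        # Stable level of this subject: the source of between-subjects variation.
        subject_level = rng.gauss(10.0, sd_between)
        # Occasion-to-occasion noise: the source of within-subjects variation.
        data.append([subject_level + rng.gauss(0.0, sd_within) for _ in range(n_times)])
    return data


def variance_components(data):
    """Method-of-moments variance components; assumes equal repeats per subject."""
    n_times = len(data[0])
    subject_means = [statistics.fmean(row) for row in data]
    # Pooled variance of each subject's measurements around that subject's own mean.
    within = statistics.fmean([statistics.variance(row) for row in data])
    # The variance of subject means still contains within / n_times of noise.
    between = max(statistics.variance(subject_means) - within / n_times, 0.0)
    # Intraclass correlation: share of total variance due to subject differences.
    icc = between / (between + within)
    return between, within, icc


data = simulate()
between, within, icc = variance_components(data)
print(f"between-subjects variance: {between:.2f}")
print(f"within-subjects variance:  {within:.2f}")
print(f"intraclass correlation:    {icc:.2f}")
```

Under these assumed settings the intraclass correlation is far from zero, which means that two measurements on the same subject are strongly correlated. That is exactly the dependence that OLS ignores when it treats all observations as uncorrelated.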

Longitudinal data are correlated because of the hierarchical structure of data sampling: repeated measurements are nested within subjects. A second type of correlation between repeated measurements can arise when, for example, each subject is measured by an observer (possibly the subject, through self-reporting) whose measurement is influenced by the previous measurement on the same subject. This type, known as serial correlation, differs from the correlation induced by nesting. One common and appropriate method of analyzing such data is repeated measures ANOVA (RMANOVA) or MANOVA (see e.g. Hand and Taylor (1987) for an introduction to this topic). However, multi-level modeling (often referred to as hierarchical linear modeling or mixed effects regression) has significant fundamental advantages over RMANOVA and therefore constitutes a best practice.
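
As an illustrative sketch, the two types of correlation can be written out for a simple random-intercept model. The notation is assumed here and is not taken from the chapter: y_{ij} is measurement j on subject i, t_{ij} is the measurement time, u_i is a subject-specific deviation, and e_{ij} is the residual. The first-order autoregressive form is only one possible example of serial correlation and assumes equally spaced measurements.

```latex
% Random-intercept model: repeated measurements j nested within subjects i
y_{ij} = \beta_0 + \beta_1 t_{ij} + u_i + e_{ij},
\qquad u_i \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)

% Nesting alone (independent residuals) gives every pair of measurements
% on the same subject the same correlation:
\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2},
\qquad j \neq k

% Serial correlation, here an assumed first-order autoregressive structure,
% instead decays as measurements lie further apart in time:
\operatorname{Corr}(e_{ij}, e_{ik}) = \rho^{|j-k|}
```

Measurements on different subjects remain uncorrelated in both cases, which is why the correlation is attributed to the grouping of measurements within subjects.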

 

Use the following data sets, provided by the chapter author, to reinforce your understanding of the chapter by working through the examples.

Data set #1: growth data

Data set #2: teacher data