Skip to main content

Linear Mixed Effects Models I

Here we'll be considering an alternative approach for analyzing longitudinal data using linear mixed effects models. These models have some subset of the individual parameters vary randomly from one subject to another, thereby accounting for sources of natural heterogeneity in the population.

  • Fixed effects are the population characteristics that are assumed to be shared by all subjects
  • Random effects are characteristics, unique for each subject
  • The combination of both gives rise to mixed effect models

We can write the linear mixed effects model as:
Yi = Xi β + Zibi + ei
Where:

  • β is a s*1 vector of fixed effects
  • bi is a q*1 vector of random effects (which follows  ~N(0, G) we will see below)
  • ei is an ni*1 vector of within subject errors (~N(0, Ri)),
  • Xi is an ni*s matrix of covariates, 
  • Zi is a ni*q matrix of covariates with q <= s

We usually assume Ri = σ2In, where In is an ni*ni identity matrix with bi and ei mutually independent.

Model Formulation

The simplest mixed effects model for longitudinal data is:
image.png
Where bi (some random intercept) and eij (error) are independent.

The marginal mean response in the population is:
image.png
The response for the ith subject at the jth measurement occasion differs from the population mean response by a subject effect bi, and a within subject measurement error eij.

The conditional mean response for any individual:
image.png

The residuals of profile analysis and parametric curves (ϵij) and parametric curves such that ϵij = bi + eij

Variance and Covariance Structure

image.png
Similarly, the marginal covariance between any pair of responses:
image.png

With some linear algebra we can write the variance-covariance matrix as follows:
image.png

Where ρ is called the intra-cluster correlation coefficient. For ρ > 0, the random intercept mixed effects model is observationally equivalent to a compound symmetry model.

Random Intercept and Slope Model

As the name suggests, this adds a random effect for both the slope and intercept.

Consider the following model:
image.png
Where
image.png

So the variance of each response varies with the time it was taken:
image.png
And likewise the covariance between any pair of responses:
image.png

Comparing Mixed Effects Models

Inference with respect to random and fixed effects can be carried out using methods described in the previous lecture for the marginal models. For the nested models, a LR test with REML may be used (which follows a chi-squared dist with 2 degrees of freedom) while non-nested models use AIC or BIC.

Empirical BLUP

In some cases we are interested in subject-specific rather than population-averaged predictions. Mixed effect models are appropriate for this analysis. The empirical best linear unbiased predictor (BLUP) of bi conditional on the observed vector of responses Yi is given by:
image.png
image.png

Given the empirical BLUP, b^i the ith subject's predicted mean is:
image.png

  • The empirical BLUP is a weighted average of the population-averaged mean Xibeta_hat and the observed response of the ith subject, Yi
  • The empirical BLUP shrinks the ith subject's predicted mean towards the population-averaged mean
  • The amount  of shrinkage depends on teh relative magntude of the within-subject variability Ri relative to the sum from i, varying weight is assigned to the population-averaged mean Xibeta_hat

Conclusion

  • Unlike the models we have seen up to now, mixed effects models explicitly distinguish between-subject and within-subject sources of variability.
  • The induced covariance structure is very parsimonious, with only a few number of parameters
  • They can be used to predict subject-specific mean response trajectories over time
  • They are very flexible in accommodating any kind of imbalance in the data, including the number and timing of measurement occasions.

SAS Code

/************************************************
* Six cities study of air pollution and health	*
*************************************************/
data fev1;
set s857.fev1;
lgfevht=Log_FEV1_-logHgt;
t=age;
run;
proc sgplot data=fev1  noautolegend ;
	* spaghetti plot;
	yaxis min = -0.3 max = 1;
	reg x=age y=lgfevht 
    / group = id nomarkers LINEATTRS = (COLOR= gray PATTERN = 1 THICKNESS = 1) ;
	* overall spline;
	reg x=age y=lgfevht 
    / nomarkers LINEATTRS = (COLOR= red PATTERN = 1 THICKNESS = 3) ;
run;
quit;
proc mixed data=fev1 covtest;
class t;
model Log_FEV1_=age logHgt Init_Age logInitHgt/s chisq;
repeated  t/type=cs subject=id R Rcorr;
run;
proc mixed data=fev1 covtest method=REML;
model Log_FEV1_=age logHgt Init_Age logInitHgt/s chisq ;
random intercept /type=un subject=id G V;
run;
proc mixed data=fev1 covtest method=REML;
model Log_FEV1_=age logHgt Init_Age logInitHgt/s chisq;
random intercept age/type=un subject=id G V;
run;

proc mixed data=fev1 covtest;
model Log_FEV1_=age logHgt Init_Age logInitHgt/s chisq;
random intercept logHgt/type=un subject=id G V;
run;


proc mixed data=fev1 covtest method=REML;
model Log_FEV1_=age logHgt Init_Age logInitHgt/s chisq;
random  intercept age logHgt/type=un subject=id G V;
run;

/************************************************
* Study of influence of menarche on changes in body fat *
*************************************************/

proc sgplot data=s857.fat  noautolegend ;
	* spaghetti plot;
	yaxis min = 0 max = 50;
	reg x=time_men y=Perc_BF 
    / group = id nomarkers LINEATTRS = (COLOR= gray PATTERN = 1 THICKNESS = 1) ;
	* overall spline;
	reg x=time_men y=Perc_BF
    / nomarkers LINEATTRS = (COLOR= red PATTERN = 1 THICKNESS = 3) ;
run;
quit;

   ods graphics on; 

   proc loess data=s857.fat plots=all;
      model Perc_BF = time_men;
   run;
 ods graphics off;

 data fat;
 set s857.fat;
 knot=max(time_men,0);
 run;
ods output solutionr=bluptable;
*ods trace on;
*ods listing;
 proc mixed data=fat covtest;
 model Perc_BF = time_men knot/s chisq outpred=yhat outpm = pred1f;
 random intercept time_men knot/type=un subject=id G V solution;
run;
*ods trace off;
ods close;

data yhat;
set yhat;
rename Pred=Pred_r;
run;
data Prediction;
merge yhat pred1f;
keep id pred pred_r time_men perc_BF;
run;

data prediction_long;
length Scale $20;
set prediction;
y=pred;Scale="Average prediction";output;
y=pred_r;Scale="Subject prediction";output;
y=perc_bf;Scale="Raw Data";output;
run;
goptions reset=all;
symbol1 c=blue  v=star   h=.8 i=j      w=5;
symbol2 c=red   v=dot    h=.8 i=j      w=1;
symbol3 c=green   v=squarefilled    h=.8 i=j      w=1;
axis1 order=(0 to 50 by 10) label=( 'Predicted % Body Fat');

proc gplot data=Prediction_long;
where id=10;
plot y*time_men=Scale/ vaxis=axis1 ;
run;
quit;

/************************************************
* Repeated CD4 counts data from AIDS clinical trial.*
*************************************************/
data CD4;
set s857.CD4;
knot=max(week-16,0);
if trt=. then treatment=.;
else if trt<4 then treatment=1;
else treatment=2;
run;
 ods graphics on; 

proc transreg ss2 data=cd4;
      model identity(log_cd4) = class(treatment / zero=none) *
                          smooth(week / sm=80);
      output p;
   run;
  
 ods graphics off;
proc mixed data=cd4 covtest;
class treatment;
model log_cd4=week knot treatment*week treatment*knot/s chisq;
random intercept  week knot/type=un subject=id G V;
run;

proc mixed data=cd4 covtest;
class treatment;
model log_cd4= week knot treatment*week treatment*knot/s chisq;
random intercept  week knot/type=un subject=id G V;
contrast'Interaction test' treatment*week 1 -1, 
							treatment*knot 1 -1;
contrast 'Slope test' treatment*week 1 -1  treatment*knot 1 -1;
run;

proc mixed data=cd4 covtest;
class treatment;
model log_cd4=week knot treatment*week treatment*knot Age Gender/s chisq outpred=random;
random intercept  week knot/type=un subject=id G V solution;
ods output solutionr=sr(keep=effect subject estimate probt);
run;
proc print data=sr;
where effect='Week' and Estimate<0 and probt<0.05;
run;