Generalized Linear Mixed Effects Models

Generalized Linear Mixed Models (GLMMs) extend linear mixed models to response variables from other distributions (such as binary or multinomial responses). Equivalently, think of them as generalized linear models (e.g. logistic regression) extended to include both fixed and random effects.

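As a point of reference for the notation used in the rest of these notes, the general form can be sketched as follows. The symbols X (fixed effects design matrix), η (linear predictor) and ε (residual) are assumed standard notation rather than taken from the text; β, Z, u and G are the quantities discussed below.

```latex
% Linear mixed model: the response is modeled directly
y = X\beta + Zu + \varepsilon

% GLMM: the same linear predictor, with random effects u
\eta = X\beta + Zu, \qquad u \sim N(0, G)
```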

In classical statistics we do not actually estimate the vector of random effects u; instead we nearly always assume u ~ N(0, G), where G is the variance-covariance matrix of the random effects. Recall that a variance-covariance matrix is always square, symmetric, and positive semi-definite. Because of the symmetry, a q × q matrix has q(q+1)/2 unique elements.

Because the fixed effects are estimated directly, the random effects (which enter the model through the design matrix Z) are modeled as deviations from the fixed effects, with mean 0. In other words, the random effects are just deviations around the value in β, which is the mean. The only thing left to estimate is their variance. In a model with only a random intercept, G is a 1 × 1 matrix (the variance of the random intercept). With a random intercept and a random slope, G is a 2 × 2 matrix with q(q+1)/2 = 3 unique elements:

\[ G = \begin{bmatrix} \sigma^2_{int} & \sigma_{int,slope} \\ \sigma_{int,slope} & \sigma^2_{slope} \end{bmatrix} \]

So far, everything we have covered applies to both linear mixed models and generalized linear mixed models. What differs is that in a GLMM the response can come from distributions other than the Gaussian. In addition, rather than modeling the response directly, a link function (referred to as g(.), with inverse h(.)) is usually applied to its expected value, such as the log link. For example, count outcomes are commonly modeled with a log link and the Poisson probability mass function, where λ is the conditional mean and η is the linear predictor (fixed plus random effects):

\[ \log(\lambda) = \eta, \qquad P(Y = y \mid \lambda) = \frac{\lambda^{y} e^{-\lambda}}{y!} \]

Interpretation

The interpretation of GLMMs is similar to that of GLMs. Often we want to transform the results back to the original scale rather than report them on the scale of the link function. This is where matters get complicated: the transformations are nonlinear, so even random intercepts no longer act as simple additive shifts on the original scale. Consider the example below of a mixed effects logistic model predicting remission:

[Image: estimates from a mixed effects logistic model predicting remission]

The estimates can be interpreted as usual: each estimated effect is a change in the log odds. Interpreting the odds ratios takes more care. In a regular logistic model, an odds ratio holds all other predictors fixed. In a mixed model it must also hold the random effects fixed, so it is a conditional odds ratio: it compares observations that share the same values of the remaining covariates and the same random effects.
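To make the conditional interpretation concrete, here is a minimal sketch in a SAS data step. All values (beta0, beta1, and the three random intercepts u) are hypothetical and chosen only for illustration; they are not estimates from any data set in these notes. For each random intercept it computes the probability at x=0 and x=1: the change in probability depends on u, while the odds ratio equals exp(beta1) for every u.

```sas
/* Illustrative values only; not estimates from any data set */
data cond_or;
  beta0 = -1;    /* hypothetical fixed intercept (logit scale) */
  beta1 = 0.5;   /* hypothetical log odds ratio for a one-unit change in x */
  do u = -2, 0, 2;                                   /* hypothetical random intercepts */
    p0 = 1 / (1 + exp(-(beta0 + u)));                /* P(y=1 | u, x=0) */
    p1 = 1 / (1 + exp(-(beta0 + beta1 + u)));        /* P(y=1 | u, x=1) */
    risk_diff  = p1 - p0;                            /* changes with u */
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0));  /* exp(beta1) for every u */
    output;
  end;
run;

proc print data=cond_or noobs;
  var u p0 p1 risk_diff odds_ratio;
run;
```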

Estimation

GLMMs have no closed-form solution for the parameter estimates, so some approximation must be used. One option is Gauss-Hermite quadrature, which PROC GLIMMIX requests with method=quad.
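As a rough illustration of what a quadrature approximation does, the sketch below uses a fixed 3-point Gauss-Hermite rule to integrate a random intercept out of a logistic model. The values of beta0 and sigma are hypothetical, and this is a plain (non-adaptive) rule written by hand, not a reproduction of what PROC GLIMMIX does internally.

```sas
data _null_;
  /* 3-point Gauss-Hermite nodes and weights for the weight function exp(-x**2) */
  array x[3] _temporary_ (-1.2247448714 0 1.2247448714);
  array w[3] _temporary_ (0.2954089752 1.1816359006 0.2954089752);
  beta0 = -1;    /* hypothetical fixed intercept (logit scale) */
  sigma = 1.5;   /* hypothetical SD of the random intercept */
  marginal = 0;
  do i = 1 to 3;
    u = sqrt(2) * sigma * x[i];   /* change of variable so that u ~ N(0, sigma**2) */
    marginal = marginal + w[i] * (1 / (1 + exp(-(beta0 + u))));
  end;
  marginal = marginal / sqrt(constant('pi'));   /* approximate P(y=1) averaged over u */
  conditional = 1 / (1 + exp(-beta0));          /* P(y=1 | u=0) */
  put marginal= conditional=;
run;
```

The two printed probabilities differ because the inverse logit is nonlinear; more quadrature points give a closer approximation to the integral.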

Gauss-Hermite quadrature has limitations. The number of function evaluations required grows exponentially with the number of dimensions: a random intercept is one dimension, and adding a random slope makes two. For three-level models with random intercepts and slopes, it is easy to create problems that are intractable with Gaussian quadrature. Consequently, it is a useful method when a high degree of accuracy is desired, but it performs poorly in high-dimensional spaces, for large datasets, or when speed is a concern. Additionally, if the outcome is skewed there can also be problems with the random effects.
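The exponential growth can be seen with a little arithmetic. The sketch below assumes a full grid of quadrature nodes (qpoints raised to the number of random effect dimensions, with no pruning) and uses the two qpoints settings that appear in the PROC GLIMMIX calls in these notes; the variable names are illustrative.

```sas
data _null_;
  do qpoints = 5, 50;            /* quadrature points per dimension */
    do dims = 1 to 4;            /* number of random effects per subject */
      nodes = qpoints**dims;     /* evaluations per subject on a full grid */
      put qpoints= dims= nodes=;
    end;
  end;
run;
```

With a random intercept and slope (two dimensions), 5 points per dimension gives 25 nodes per subject, while 50 points gives 2,500.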

SAS Code

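/* Library holding the data sets */
libname S857 'Z:\';

/* t is a copy of time used as a class variable; time2 is the quadratic term */
data amenorrhea;
  set s857.amenorrhea;
  t = time;
  time2 = time**2;
run;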

proc sort data=amenorrhea;
by descending trt;
run;
title1 'Marginal Models';
title2 'Clinical Trial of Contracepting Women';

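/* Marginal logistic model (GEE); within-subject association modeled with log odds ratios */
proc genmod data=amenorrhea descending;
  class id trt (ref='0') t / param=ref;
  model y = time time2 trt*time trt*time2 / dist=binomial link=logit type3 wald;
  repeated subject=id / withinsubject=t logor=fullclust;
  store p1;   /* save the fit so proc plm can score it */
run;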
ods graphics on;
ods html style=journal;
proc plm source=p1;
  score data = amenorrhea out=pred /ilink;
run;
proc sort data = pred;
  by trt time;
run;
proc sgplot data = pred;
  series x = time y = predicted /group=trt;
run;
ods graphics off;

title1 'Generalized Linear Mixed Effects Models';

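/* Logistic model with random intercept and slope, fit by quadrature with 5 points */
proc glimmix data=amenorrhea method=quad(qpoints=5) empirical order=data;
  class id trt;
  model y = time time2 trt*time trt*time2 / dist=binomial link=logit s oddsratio;
  random intercept time / subject=id type=un;   /* unstructured 2x2 G matrix */
run;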

title1 'Generalized Linear Mixed Effects Models';
title2 'Clinical Trial of AntiEpileptic-Drug';
/* Reshape from wide (y0-y4) to long: one record per subject and time point */
data epilepsy;
  set s857.epilepsy;
  time=0; y=y0; output;
  time=1; y=y1; output;
  time=2; y=y2; output;
  time=3; y=y3; output;
  time=4; y=y4; output;
  drop y0 y1 y2 y3 y4;
run;
/* Offset: log of the length of each observation period (8 at baseline, 2 afterwards) */
data epilepsy;
  set epilepsy;
  if time=0 then t=log(8);
  else t=log(2);
run;
proc sort data=epilepsy;
by descending trt;
run;
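/* Poisson model with log link, offset t, and random intercept and slope */
proc glimmix data=epilepsy method=quad(qpoints=50) empirical order=data;
  class id trt;
  model y = time trt trt*time / dist=poisson link=log s offset=t;
  random intercept time / subject=id type=un;   /* unstructured 2x2 G matrix */
run;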
ods rtf close;

Revision #4
Created 7 March 2023 21:51:32 by Elkip
Updated 8 March 2023 15:11:28 by Elkip