Model Fit & Concepts of Interaction

Review: Logistic and Proportional Hazards Regression Model Selection

Nested models can be compared by looking at the change in deviance (the change in -2 log likelihood) between them.
image-1668437735242.png
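
As a rough illustration of comparing nested models by the change in deviance, here is a minimal sketch in base R. The data are simulated and the variable names (x1, x2, y) are made up for this example; only the logistic case is shown.

```r
# Simulated data; x1, x2 and y are hypothetical names used only for illustration
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 0.8 * x1 + 0.5 * x2))

reduced <- glm(y ~ x1, family = binomial)       # smaller (nested) model
full    <- glm(y ~ x1 + x2, family = binomial)  # adds x2

# Change in deviance = change in -2 log likelihood between the two models
dev_change <- deviance(reduced) - deviance(full)
df_change  <- df.residual(reduced) - df.residual(full)
p_value    <- pchisq(dev_change, df = df_change, lower.tail = FALSE)

# anova() reports the same likelihood ratio test in one call
lrt <- anova(reduced, full, test = "Chisq")
print(lrt)
```

A small p-value suggests the extra term improves the fit beyond what chance alone would explain.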

Risk Prediction and Model Performance

How well does this model predict whether a person will have the outcome? This question is generally asked only for dichotomous outcomes, and it is especially common in genetics.

Logistic

Hosmer-Lemeshow Test

The Hosmer-Lemeshow Test is a statistical goodness of fit test for logistic regression models. It is frequently used in risk prediction models. It assesses whether the observed event rates match the expected event rates in subgroups of the model population of size n. Specifically, the subgroups are the deciles of fitted risk values, with nj subjects in the j-th subgroup.

H0: The observed and expected proportions are the same across all groups

image-1668439254340.png
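
To make the grouping concrete, the statistic can be computed by hand in base R. This is a sketch under stated assumptions: the data are simulated, the names (x, y, p_hat, obs_j, exp_j) are invented for the example, and the statistic is compared to a chi-square distribution with g - 2 degrees of freedom, which is the usual convention.

```r
# Simulated data and a fitted logistic model (hypothetical example)
set.seed(2)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + x))
mod   <- glm(y ~ x, family = binomial)
p_hat <- fitted(mod)

g <- 10  # number of subgroups (deciles of fitted risk)
cuts  <- quantile(p_hat, probs = seq(0, 1, length.out = g + 1))
group <- cut(p_hat, breaks = cuts, include.lowest = TRUE)

n_j   <- tapply(y, group, length)   # subgroup sizes
obs_j <- tapply(y, group, sum)      # observed events per subgroup
exp_j <- tapply(p_hat, group, sum)  # expected events per subgroup

# Sum over subgroups of (observed - expected)^2 / (expected * (1 - mean fitted risk))
hl_stat <- sum((obs_j - exp_j)^2 / (exp_j * (1 - exp_j / n_j)))
p_value <- pchisq(hl_stat, df = g - 2, lower.tail = FALSE)

cat("HL statistic:", hl_stat, " p-value:", p_value, "\n")
```

A large p-value means there is no evidence that observed and expected proportions differ across the subgroups.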

ROC: Receiver Operating Characteristic Curve / C Statistic

The ROC curve plots sensitivity (the true positive rate) against 1 - specificity (the false positive rate) for different decision thresholds, so you can look for the best trade-off between sensitivity and specificity (the true negative rate). The method originates in signal detection theory.
image-1668440931670.png

The area under the curve (AUC) is a summary measure called the c statistic, which is the probability that a randomly chosen subject with the event will have a higher predicted probability of the event than a randomly chosen subject without the event (a measure of discrimination).

image-1668441491781.png

The c statistic considers all pairs of subjects with different outcomes. A pair is concordant when the subject with the higher predicted value also has the higher outcome. A pair is discordant when the subject with the higher predicted value has the lower outcome.
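
The pair counting described here can be sketched directly in base R. The data below are simulated and all names are hypothetical; ties in predicted values are counted as half concordant, which is a common convention rather than something stated in these notes. The rank-based formula at the end is the Mann-Whitney form of the same quantity.

```r
# Simulated data and fitted risks (hypothetical example)
set.seed(3)
n <- 300
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))
p_hat <- fitted(glm(y ~ x, family = binomial))

p_event    <- p_hat[y == 1]  # predicted risks for subjects with the event
p_nonevent <- p_hat[y == 0]  # predicted risks for subjects without the event

# One entry per (event, non-event) pair: difference in predicted risk
diffs      <- outer(p_event, p_nonevent, "-")
concordant <- sum(diffs > 0)
discordant <- sum(diffs < 0)
tied       <- sum(diffs == 0)

# c statistic = proportion of concordant pairs (ties count as one half)
c_stat <- (concordant + 0.5 * tied) / length(diffs)

# Equivalent rank-based (Mann-Whitney) calculation
n1 <- length(p_event)
n0 <- length(p_nonevent)
c_rank <- (sum(rank(p_hat)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

cat("c statistic:", c_stat, "\n")
```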

When the c statistic is computed on the same data used to build the model, it cannot be interpreted as the "true" predictive accuracy. It is simply a measure of goodness of fit.

Real predictive accuracy can be estimated when you have a new data set that was not used to build the model. If that is not possible, internal validation (for example, cross-validation) can be considered.
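
One common form of internal validation is k-fold cross-validation. The sketch below is an assumption about how this could be done, not a method prescribed by these notes: it uses simulated data, hypothetical names, and k = 5 folds, and compares the apparent c statistic with a cross-validated one.

```r
# Simulated data (hypothetical example)
set.seed(4)
n   <- 400
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 + 0.4 * dat$x2))

# c statistic via the rank (Mann-Whitney) formula
c_stat <- function(p, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

k    <- 5
fold <- sample(rep(1:k, length.out = n))  # random fold assignment
cv_pred <- numeric(n)

for (i in 1:k) {
  train    <- dat[fold != i, ]
  held_out <- dat[fold == i, ]
  fit <- glm(y ~ x1 + x2, family = binomial, data = train)
  # Predict only for subjects that were NOT used to fit this model
  cv_pred[fold == i] <- predict(fit, newdata = held_out, type = "response")
}

apparent        <- c_stat(fitted(glm(y ~ x1 + x2, family = binomial, data = dat)), dat$y)
cross_validated <- c_stat(cv_pred, dat$y)

cat("Apparent c:", apparent, " Cross-validated c:", cross_validated, "\n")
```

The cross-validated value is usually somewhat lower than the apparent one, because each prediction comes from a model that never saw that subject.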

Survival Analysis

Interaction Analysis

Interaction is when the effect of an exposure depends on the presence or absence of another exposure, or on the level of another exposure variable.

Effect Modification

There is a difference between interaction and effect modification but I will not go into detail here.

In effect modification, one of the 'exposures' is a non-modifiable background variable, such as a demographic variable (a 'moderator'). The moderator affects the size and/or direction of the association between a primary exposure and an outcome.

Ex. Sex may be a moderator of the association between hypertension and heart disease, as reflected by the difference in risk ratios between men and women.

Statistical vs. Biological Interaction

Quantitative Interaction

Quantitative interaction is when the direction of the effect of an exposure on an outcome is the same for different subgroups but the size of the effect differs.

image-1668444157006.png

Qualitative Interaction

Qualitative interaction is when the direction of the effect of an exposure on an outcome changes for different subgroups.

image-1668444230634.png

Additive Interaction

Note: In the following sections we ignore sampling variability. We'll assume we have very, very large samples or an entire population.

Suppose we have two binary exposures A and B with risk:
image-1668444575522.png
There is interaction on the additive scale if the effect of the two exposures together is not equal to the sum of their individual effects:
image-1668444641516.png

Example: Suppose we have two binary exposures A and B that have the following risk table:
image-1668444554657.png
The risk difference compared to those with neither exposure:
image-1668444384936.png
Note that RD11 > RD10 + RD01 (Synergistic)
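
The same check can be scripted. The risks below are hypothetical values chosen for illustration (they are not the values from the table above), and the names R00, R10, R01 and R11 are assumed to mean the risk with neither exposure, A only, B only, and both.

```r
# Hypothetical risks by exposure group (illustrative values only)
risk <- c(R00 = 0.10, R10 = 0.20, R01 = 0.30, R11 = 0.50)

# Risk differences relative to the group with neither exposure
rd10 <- unname(risk["R10"] - risk["R00"])
rd01 <- unname(risk["R01"] - risk["R00"])
rd11 <- unname(risk["R11"] - risk["R00"])

# Additive interaction contrast: joint effect minus the sum of individual effects
contrast <- rd11 - (rd10 + rd01)

cat("RD10 =", rd10, " RD01 =", rd01, " RD11 =", rd11, "\n")
if (contrast > 1e-12) {
  cat("RD11 > RD10 + RD01: synergistic (positive additive interaction)\n")
} else if (contrast < -1e-12) {
  cat("RD11 < RD10 + RD01: negative additive interaction\n")
} else {
  cat("RD11 = RD10 + RD01: no additive interaction\n")
}
```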

Risk Ratio Difference

Recall that a risk ratio of 1 means no difference in risk between the groups being compared (the null value).

For risk ratios: There is an interaction on the additive scale if the effect of the two exposures together is not equal to the sum of their individual effects:
image-1668444840216.png

Thus, there is an interaction on the additive scale if the deviation from 1 (the null value) of the risk ratio for both exposures is not equal to the sum of deviations from 1 of the individual risk ratio for each exposure separately.
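
The step from risk differences to risk ratios can be written out explicitly. This is a sketch assuming the notation R_ab for the risk with A = a and B = b, with R_00 > 0 the risk in the group with neither exposure and RR_ab = R_ab / R_00. Dividing the no-interaction condition on the risk difference scale by R_00 gives:

```latex
\begin{aligned}
R_{11} - R_{00} &= (R_{10} - R_{00}) + (R_{01} - R_{00}) \\
\frac{R_{11}}{R_{00}} - 1 &= \left(\frac{R_{10}}{R_{00}} - 1\right) + \left(\frac{R_{01}}{R_{00}} - 1\right) \\
RR_{11} - 1 &= (RR_{10} - 1) + (RR_{01} - 1) \\
RR_{11} - RR_{10} - RR_{01} + 1 &= 0
\end{aligned}
```

Because R_00 cancels out of the condition, only the three risk ratios are needed to check it.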

The important thing to remember here is that additive interaction can be assessed based on risk ratios alone, without having the underlying risks.

Relative Excess Risk Due to Interaction

So there is no interaction on the additive scale if:
image-1668445369884.png

The quantity RR11 - RR10 - RR01 + 1 is called the relative excess risk due to interaction, or RERI:
image-1668445478212.png

There is interaction on the additive scale if the RERI != 0
> 0 means positive additive interaction
< 0 means negative additive interaction

Multiplicative Scale

Suppose we have a table of risks similar to the additive risk above.

There is an interaction on the multiplicative scale if the effect of the two exposures together is not equal to the product of their individual effects:
image-1668445666784.png

image-1668445701312.png

It is possible to have both additive and multiplicative interaction, or a positive additive interaction and negative multiplicative interaction.
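
A small numeric example shows how the two scales can disagree. The risks below are hypothetical, chosen only so that the additive measure (RERI) is positive while the multiplicative measure (the ratio of the joint risk ratio to the product of the individual risk ratios) is below 1.

```r
# Hypothetical risks: neither exposure, A only, B only, both
r00 <- 0.10
r10 <- 0.20
r01 <- 0.30
r11 <- 0.50

# Risk ratios relative to the group with neither exposure
rr10 <- r10 / r00
rr01 <- r01 / r00
rr11 <- r11 / r00

# Additive scale: relative excess risk due to interaction
reri <- rr11 - rr10 - rr01 + 1

# Multiplicative scale: joint risk ratio over the product of the individual ones
mult_ratio <- rr11 / (rr10 * rr01)

cat("RERI =", reri, "(> 0: positive additive interaction)\n")
cat("RR11 / (RR10 * RR01) =", mult_ratio, "(< 1: negative multiplicative interaction)\n")
```

Here the joint risk ratio of 5 exceeds the additive expectation (2 + 3 - 1 = 4) but falls short of the multiplicative expectation (2 * 3 = 6).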

Case Control Studies

Additive Scale

If the outcome is rare, so that ORs estimate RRs, then we can say there is additive interaction if:
image-1668446012109.png

We can define the relative excess risk due to interaction as:
image-1668446064778.png

There is interaction on the additive scale if RERI != 0
> 0 means positive additive interaction
< 0 means negative additive interaction

Multiplicative Scale

In this setting multiplicative interaction exists when:
OR_11 != OR_10 * OR_01

OR_11 / (OR_10 * OR_01)
> 1 means positive multiplicative interaction
< 1 means negative multiplicative interaction
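
For a case-control study, both measures can be computed from the odds ratios in each joint exposure group. The counts below are hypothetical and the column names are made up for this sketch; the additive measure is only meaningful under the rare-outcome assumption stated in these notes.

```r
# Hypothetical case-control counts by joint exposure status
counts <- data.frame(
  A        = c(0, 1, 0, 1),
  B        = c(0, 0, 1, 1),
  cases    = c(50, 40, 60, 90),
  controls = c(200, 80, 120, 60)
)

# Odds of being a case in each group, then odds ratios vs. neither exposure
odds <- counts$cases / counts$controls
or   <- odds / odds[1]
names(or) <- c("OR_00", "OR_10", "OR_01", "OR_11")

# Additive scale (assumes a rare outcome so that ORs approximate RRs)
reri <- unname(or["OR_11"] - or["OR_10"] - or["OR_01"] + 1)

# Multiplicative scale
mult_ratio <- unname(or["OR_11"] / (or["OR_10"] * or["OR_01"]))

print(or)
cat("RERI =", reri, " OR_11 / (OR_10 * OR_01) =", mult_ratio, "\n")
```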

Note that this is based on odds ratios only. It does not imply multiplicative interaction between risk ratios unless the outcome is rare and the ORs approximate the RRs.

Additive vs Multiplicative Interaction

Modeling with Interaction

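Hierarchical models always include the lower-order terms that make up any higher-order term in the model. For example, a model with an A x B interaction term also includes the main effects of A and B.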

Ex. Hierarchical:
image-1668446321789.png

Ex. Non-Hierarchical
image-1668446355255.png

When there is no interaction term in a model, the log odds ratio for one exposure is the same at every level of the other exposure.
image-1668965077146.png
image-1668965223236.png

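When there is an interaction term, the log odds ratio for one exposure is NOT constant: it changes with the level of the other exposure. A nonzero interaction coefficient also means there is interaction on the multiplicative scale.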
image-1668965351231.png
image-1668965420872.png

Synergy: If beta_3 > 0 the joint effect of A and B is greater than the product of the individual effects.
Antagonism: If beta_3 < 0 the joint effect of A and B is less than the product of the individual effects.

The odds ratios (the group with neither exposure as the reference):
image-1668965666951.png

If there is an interaction term, the main effects cannot be interpreted alone, but are relative to the state of other variables.
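
The points above can be illustrated by fitting a logistic model with an interaction term in base R. This is a sketch on simulated data: the exposures A and B, the outcome y, and the true coefficients are all invented for the example, and beta_3 is taken to be the coefficient of the A:B term.

```r
# Simulated binary exposures and outcome (hypothetical example)
set.seed(5)
n <- 5000
A <- rbinom(n, 1, 0.5)
B <- rbinom(n, 1, 0.5)
y <- rbinom(n, 1, plogis(-2 + 0.5 * A + 0.4 * B + 0.7 * A * B))

# A * B expands to A + B + A:B, so the model stays hierarchical
mod <- glm(y ~ A * B, family = binomial)
b   <- coef(mod)

# Odds ratios with the group with neither exposure as the reference
or10 <- exp(b["A"])
or01 <- exp(b["B"])
or11 <- exp(b["A"] + b["B"] + b["A:B"])

# The odds ratio for A depends on the level of B when the A:B term is present
or_A_given_B0 <- exp(b["A"])
or_A_given_B1 <- exp(b["A"] + b["A:B"])

# The multiplicative interaction measure equals exp(beta_3)
mult_ratio <- or11 / (or10 * or01)

cat("OR for A when B = 0:", or_A_given_B0, "\n")
cat("OR for A when B = 1:", or_A_given_B1, "\n")
cat("OR_11 / (OR_10 * OR_01):", mult_ratio, " exp(beta_3):", exp(b["A:B"]), "\n")
```

The coefficient of A alone is the log odds ratio for A only when B = 0, which is why the main effects cannot be read in isolation once the interaction term is in the model.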

R Code

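library(effects) # To plot interactions
library(survival) # load survival package
library(ResourceSelection) # provides hoslem.test()

# Hosmer-Lemeshow Test
# y (0/1 outcome) and x (predictor) are assumed to already exist in the workspace
mod <- glm(y ~ x, family = binomial)
# Compare observed outcomes with fitted risks in g = 10 groups
hl <- hoslem.test(mod$y, fitted(mod), g = 10)
hl # print the test statistic, degrees of freedom and p-value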

Revision #10
Created 14 November 2022 14:53:55 by Elkip
Updated 20 November 2022 19:28:24 by Elkip