Stratification and Interaction
Which Summary Measure to Use?
- Weighted averages are usually best
- Mantel-Haenszel (MH) estimates are easy to compute and can handle zero cells
- Maximum likelihood estimates (MLE) are difficult to compute by hand and typically require a computer
Weighted Average in MH Summaries
Consider the following table:
|       | Sample 1 | Sample 2 |
| n     | 30       | 70       |
| x_bar | 5        | 8        |
Weighted average of population -> ((30*5)+(70*8))/(30+70) = 7.1
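The weighted average lies closer to the mean of the sample with the larger sample size. Any weighted average has the general form:
theta_hat_summary = sum_i(w_i * theta_hat_i) / sum_i(w_i)
where theta_hat_i is the estimate in group (or stratum) i, such as a mean or an odds ratio (OR), and w_i is its weight.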
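The general form can be checked against the table above with a few lines of Python. This is a minimal sketch; the function name `weighted_average` is chosen here for illustration and is not part of the original notes.

```python
def weighted_average(estimates, weights):
    """Return sum(w_i * theta_i) / sum(w_i) for paired estimates and weights."""
    if len(estimates) != len(weights):
        raise ValueError("estimates and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * t for w, t in zip(weights, estimates)) / total_weight


# Values from the table: sample means weighted by sample size.
means = [5, 8]
sizes = [30, 70]
pooled_mean = weighted_average(means, sizes)
print(pooled_mean)  # 7.1
```

Using the sample sizes as weights reproduces the value 7.1 computed by hand.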
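The MH odds ratio (OR) and risk ratio (RR) can both be written as weighted averages of the stratum-specific estimates. In stratum i, let a_i, b_i, c_i, d_i be the cell counts of the 2x2 table and n_i the stratum total:
OR_MH = sum_i(a_i*d_i / n_i) / sum_i(b_i*c_i / n_i) = sum_i(w_i * OR_hat_i) / sum_i(w_i)
where OR_hat_i = (a_i*d_i) / (b_i*c_i) is the odds ratio in each stratum and the weights are w_i = (b_i*c_i) / n_i.
RR_MH = sum_i(a_i*n_0i / n_i) / sum_i(b_i*n_1i / n_i) = sum_i(w_i * RR_hat_i) / sum_i(w_i)
where RR_hat_i = (a_i/n_1i) / (b_i/n_0i) is the risk ratio in each stratum and w_i = (b_i*n_1i) / n_i is the weight. Here a_i and b_i are the numbers of exposed and unexposed cases, and n_1i and n_0i are the exposed and unexposed totals.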
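The MH summaries can be sketched in Python as follows. The cell layout is an assumption made for this example: each stratum is a tuple (a, b, c, d) with a = exposed cases, b = unexposed cases, c = exposed non-cases, d = unexposed non-cases, so that n_1 = a + c and n_0 = b + d. The counts are hypothetical and serve only to illustrate the arithmetic.

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel OR: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def mh_risk_ratio(strata):
    """Mantel-Haenszel RR: sum(a*n_0/n) / sum(b*n_1/n) over strata."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n1 = a + c          # exposed total
        n0 = b + d          # unexposed total
        n = n1 + n0         # stratum total
        numerator += a * n0 / n
        denominator += b * n1 / n
    return numerator / denominator


# Hypothetical counts for two strata: (a, b, c, d).
strata = [(10, 5, 90, 95), (20, 10, 80, 90)]
or_mh = mh_odds_ratio(strata)
rr_mh = mh_risk_ratio(strata)

# The same OR written explicitly as a weighted average with w = b*c/n.
weights = [b * c / (a + b + c + d) for a, b, c, d in strata]
stratum_ors = [(a * d) / (b * c) for a, b, c, d in strata]
or_weighted = sum(w * o for w, o in zip(weights, stratum_ors)) / sum(weights)
```

Because the weight b*c/n multiplies the stratum odds ratio a*d/(b*c), a stratum with b = 0 or c = 0 still contributes a*d/n to the numerator in `mh_odds_ratio`, which is why the MH form tolerates zero cells while the explicit weighted-average form does not.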
Assumptions of Mantel-Haenszel Summary Measures
- Observations are independent from each other
- All observations are identically distributed
- The common effect assumption should hold:
- Follow-up cohort study - The stratum-specific risk ratios are all equal across the strata
- Case-control - The stratum specific odds ratios are all equal across the strata
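MH summary measures are biased, and can be misleading, when the common effect assumption does not hold.
An extreme example is interaction in which the exposure is protective in one stratum and detrimental in another. The protective stratum pulls the weighted average in one direction (a negative contribution to the numerator on a difference or log scale) and the detrimental stratum pulls it in the other (a positive contribution), so the two can cancel and the summary can suggest little or no effect.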
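The cancellation can be illustrated numerically. The sketch below uses hypothetical counts, built so that the exposure is harmful in one stratum and protective in the other; the tuple layout (a, b, c, d) = (exposed cases, unexposed cases, exposed non-cases, unexposed non-cases) is an assumption of this example.

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel OR: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical strata with opposite effects.
harmful = (20, 10, 10, 20)      # stratum OR = (20*20)/(10*10) = 4
protective = (10, 20, 20, 10)   # stratum OR = (10*10)/(20*20) = 0.25

stratum_ors = [(a * d) / (b * c) for a, b, c, d in (harmful, protective)]
summary_or = mh_odds_ratio([harmful, protective])

print("stratum ORs:", stratum_ors)
print("MH summary OR:", summary_or)
```

With these counts the summary odds ratio comes out at 1, even though neither stratum shows a null effect, so the stratum-specific estimates should be reported instead of a single summary.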
Precision-based Summary Estimators
Also called Woolf's method. Precision-based summary estimators are also weighted averages. Each stratum is weighted by the inverse of the variance of its estimate, so the strata with the smallest variance receive the most weight. These estimators are designed to have the greatest precision (smallest standard error). For ratio measures we usually work on the log scale, where the sampling distribution is more symmetric. The general approach:
ln(theta_hat_summary) = sum_i(w_i * ln(theta_hat_i)) / sum_i(w_i), with w_i = 1 / Var(ln(theta_hat_i))
This is the sum of the products of each stratum-specific log ratio and its weight, divided by the sum of the weights. Exponentiating the result returns the summary to the ratio scale.
Precision-based Summary Odds Ratio
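For the odds ratio in stratum i, Var(ln(OR_hat_i)) ~ 1/a_i + 1/b_i + 1/c_i + 1/d_i, so the weight is w_i = 1 / (1/a_i + 1/b_i + 1/c_i + 1/d_i).
The variance of the summary log odds ratio is approximately 1 / sum_i(w_i), which gives the CI:
exp( ln(OR_hat_summary) +/- z_(1-alpha/2) * sqrt(1 / sum_i(w_i)) )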
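A minimal Python sketch of the precision-based summary odds ratio, using the variance approximation 1/a + 1/b + 1/c + 1/d and the standard large-sample result that the variance of the pooled log OR is about 1 over the sum of the weights. The function name, the tuple layout (a, b, c, d), the default z = 1.96 for a 95% interval, and the counts are assumptions made for illustration.

```python
import math


def woolf_summary_or(strata, z=1.96):
    """Inverse-variance (Woolf) summary OR with an approximate CI.

    strata: iterable of (a, b, c, d) cell counts, all non-zero.
    Returns (estimate, lower, upper) on the odds ratio scale.
    """
    weights = []
    log_ors = []
    for a, b, c, d in strata:
        if min(a, b, c, d) == 0:
            # 1/0 and ln(0) are undefined, so zero cells cannot be used here.
            raise ValueError("Woolf's method requires non-zero cell counts")
        variance = 1 / a + 1 / b + 1 / c + 1 / d
        weights.append(1 / variance)
        log_ors.append(math.log((a * d) / (b * c)))

    total_weight = sum(weights)
    pooled_log_or = sum(w * l for w, l in zip(weights, log_ors)) / total_weight
    standard_error = math.sqrt(1 / total_weight)

    estimate = math.exp(pooled_log_or)
    lower = math.exp(pooled_log_or - z * standard_error)
    upper = math.exp(pooled_log_or + z * standard_error)
    return estimate, lower, upper


# Hypothetical counts for two strata.
estimate, lower, upper = woolf_summary_or([(10, 5, 90, 95), (20, 10, 80, 90)])
print(f"OR = {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

Unlike the Mantel-Haenszel estimator, this approach fails when any cell is zero, which the sketch reports as an error instead of silently adjusting the counts.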
Precision-based Summary Risk Ratio
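For the risk ratio in stratum i, Var(ln(RR_hat_i)) = (1 - p_hat1)/(n_1*p_hat1) + (1 - p_hat2)/(n_2*p_hat2), where p_hat1 and p_hat2 are the estimated risks in the two comparison groups and n_1 and n_2 are the group sizes. The weight is w_i = 1 / Var(ln(RR_hat_i)).
The variance of the summary log risk ratio is approximately 1 / sum_i(w_i), which gives the CI:
exp( ln(RR_hat_summary) +/- z_(1-alpha/2) * sqrt(1 / sum_i(w_i)) )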
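The same inverse-variance recipe can be sketched for the risk ratio, using the variance formula for ln(RR_hat) given above. The input layout (x1, n1, x2, n2), meaning events and group size in group 1 and group 2, the function name, the default z = 1.96, and the counts are all assumptions of this example.

```python
import math


def precision_summary_rr(strata, z=1.96):
    """Inverse-variance summary RR with an approximate CI.

    strata: iterable of (x1, n1, x2, n2), events and totals per group.
    Returns (estimate, lower, upper) on the risk ratio scale.
    """
    weights = []
    log_rrs = []
    for x1, n1, x2, n2 in strata:
        if x1 == 0 or x2 == 0:
            # A zero risk makes both ln(RR) and its variance undefined.
            raise ValueError("each group needs at least one event")
        p1 = x1 / n1
        p2 = x2 / n2
        variance = (1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2)
        weights.append(1 / variance)
        log_rrs.append(math.log(p1 / p2))

    total_weight = sum(weights)
    pooled_log_rr = sum(w * l for w, l in zip(weights, log_rrs)) / total_weight
    standard_error = math.sqrt(1 / total_weight)

    estimate = math.exp(pooled_log_rr)
    lower = math.exp(pooled_log_rr - z * standard_error)
    upper = math.exp(pooled_log_rr + z * standard_error)
    return estimate, lower, upper


# Hypothetical counts: both strata have a stratum-specific RR of 2.
rr, rr_lower, rr_upper = precision_summary_rr(
    [(10, 100, 5, 100), (20, 100, 10, 100)]
)
print(f"RR = {rr:.2f}, 95% CI ({rr_lower:.2f}, {rr_upper:.2f})")
```

When every stratum has the same risk ratio, as in these counts, the pooled estimate equals that common value and pooling only narrows the interval.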