Multiple Comparisons

There are many situations where multiple hypothesis tests are needed: ANOVA with more than 2 groups, genetic data, interim analyses, multiple outcomes, etc. Clinical trials often have 3 or more arms to reduce administrative burden and improve efficiency and comparability.

Recall that a hypothesis test involves 2 possible truth states and 2 possible decisions, giving four possible outcomes:

                        H0 true               H0 false
Reject H0               Type 1 error (α)      Correct (power = 1 - β)
Fail to reject H0       Correct (1 - α)       Type 2 error (β)

α = probability of a Type 1 error; β = probability of a Type 2 error; 1 - β = power

Assume we carry out m independent statistical tests, each at significance level α. The probability of not making a Type 1 error in any of the tests is (1 - α) * (1 - α) * ... * (1 - α) = (1 - α)^m, so the probability of making at least one Type 1 error is 1 - (1 - α)^m, which grows quickly as m increases.
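A minimal sketch of how quickly this error probability inflates, assuming an illustrative α = 0.05 and up to 5 independent tests (values chosen for illustration, not from the notes):

* Sketch: probability of at least one Type 1 error across m independent tests at alpha = 0.05;
data _null_;
alpha = 0.05;
do m = 1 to 5;
fwer = 1 - (1 - alpha)**m;   * 0.05, 0.098, 0.143, 0.185, 0.226;
put m= fwer=;
end;
run;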

Multiplicity may occur when we use more complex designs, such as 3 or more treatment groups, multiple outcomes, or repeated measurements on the same outcome.

Types of Error Rates

The comparison-wise error rate is the probability of a Type 1 error on a single comparison; the family-wise error rate (FWER) is the probability of making at least one Type 1 error across the entire family of comparisons. The adjustments below aim to control the FWER.

Analytic Strategies

Adjusting for Multiplicity

Fisher's Least Significant Difference (LSD)

We complete the global ANOVA first; if it is rejected, we carry out the pairwise comparisons without correcting the p-values. This is the easiest method, but it requires that the global ANOVA be rejected before any pairwise comparison is made. The FWER is only controlled when all null hypotheses are true (i.e., in the weak sense).


proc glm data=headache;
class group;
model outcome=group;
lsmeans group / tdiff pdiff stderr cl;
* tdiff = t-statistics and p-values for pairwise tests;
* pdiff = p-values for pairwise tests;
* stderr = standard errors for means;
* cl = confidence limits;
run;
quit;

(Output: global ANOVA F-test)
This output suggests we reject the null hypothesis and conclude that the mean differs in at least one group. Thus we can go on to the pairwise comparisons:
(Output: pairwise least-squares means comparisons)

P-Value (Single Step) Adjustments

These adjustments correct the comparison-wise alpha level so that the family-wise error rate is controlled at 0.05. With the Bonferroni correction, each of the m comparisons is tested at level α/m (equivalently, each raw p-value is multiplied by m). There are two ways to implement the Bonferroni correction in PROC GLM:

* Bonferroni correction;
proc glm data=headache;
class group;
model outcome=group;
lsmeans group / tdiff pdiff stderr cl adjust=bon;
run;
quit;

* We can also use the Bonferroni correction with a control group;
proc glm data=headache;
class group;
model outcome=group;
lsmeans group / tdiff pdiff=control('Placebo') stderr cl adjust=bon;
run;
quit;
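
A rough numeric sketch of the α/m logic behind adjust=bon, using an illustrative m = 3 comparisons and an illustrative raw p-value (neither taken from the output above):

* Sketch: Bonferroni adjustment by hand for m = 3 pairwise comparisons;
data _null_;
alpha = 0.05;
m = 3;
adj_alpha = alpha / m;        * each comparison is tested at roughly 0.0167;
raw_p = 0.025;                * an illustrative raw p-value;
adj_p = min(raw_p * m, 1);    * Bonferroni-adjusted p-value = 0.075;
put adj_alpha= adj_p=;
run;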

* Tukey-Kramer correction;
proc glm data=headache;
class group;
model outcome=group;
lsmeans group / tdiff pdiff stderr cl adjust=tukey;
run;
quit;

* Dunnett;
proc glm data=headache;
class group;
model outcome=group;
lsmeans group / tdiff pdiff=control('Placebo') stderr cl adjust=dunnett;
run;
quit;

Dunnett's test takes advantage of the correlations among the test statistics, so it is generally less conservative than Bonferroni (i.e., it has a lower Type 2 error rate) when the only comparisons of interest are against a control.
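
A minimal sketch for comparing the two adjustments side by side, assuming the same headache data set and Placebo control as above (PROC GLM accepts multiple LSMEANS statements in one call):

* Sketch: Bonferroni vs. Dunnett adjusted p-values for comparisons against Placebo;
proc glm data=headache;
class group;
model outcome=group;
lsmeans group / pdiff=control('Placebo') cl adjust=bon;     * Bonferroni-adjusted;
lsmeans group / pdiff=control('Placebo') cl adjust=dunnett; * Dunnett-adjusted;
run;
quit;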

Step-Wise Adjustments

Step-down procedures such as Holm's can be applied to a set of raw p-values with PROC MULTTEST:

data pvals;
input test $ raw_p @@;
cards;
AvP 0.0002 NvP 0.0001 NvA 0.025
;
run;

proc multtest inpvalues=pvals bonferroni holm out=adjp;
* holm requests the step-down Bonferroni (Holm) adjustment, out= saves the adjusted p-values;
run;

proc print data=adjp;
run;

In the step-down (Holm) procedure, the hypotheses are tested in order from the smallest to the largest p-value, and each hypothesis is tested only if all of the hypotheses before it were rejected; this is why the FWER is controlled without the full conservativeness of Bonferroni.
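
A minimal data-step sketch of the Holm adjustment itself, assuming the pvals data set created above; it reproduces what the holm option computes (sort the raw p-values, multiply the i-th smallest by m - i + 1, then force the adjusted values to be non-decreasing):

* Sketch: Holm (step-down Bonferroni) adjusted p-values computed by hand;
proc sort data=pvals;
by raw_p;
run;

data holm;
set pvals nobs=m;
retain prev 0;
adj = raw_p * (m - _n_ + 1);    * i-th smallest raw p-value times (m - i + 1);
adj_p = min(max(adj, prev), 1); * non-decreasing and capped at 1;
prev = adj_p;
drop adj prev;
run;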

