Interim Analysis and Data Monitoring

Clinical trials are often longitudinal in nature. Because it is rarely possible to enroll all subjects at the same time, a longitudinal study can take a long time to complete. Over the course of the trial one needs to consider administrative monitoring, safety monitoring, and efficacy monitoring.

Efficacy monitoring can be performed by taking interim looks at the primary endpoint data (before all subjects have been enrolled or have completed treatment).

An interim analysis can evaluate early efficacy, early futility, or safety concerns, or support adaptive design decisions about sample size or power.

Group Sequential Design

A common type of study design incorporating interim analyses is the group sequential design (GSD), in which data are analyzed at pre-planned intervals as they accumulate.

Due to multiple testing, the probability of observing at least one significant interim result is much greater than the overall α = .05. As a result, each interim analysis should NOT be performed at the nominal .05 level; instead, the family-wise (overall) error rate must be controlled. Note also that each interim analysis contains the data from the previous interims, so the test statistics are not independent.
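A small Monte Carlo sketch (not from the notes) illustrates the inflation. It simulates a trial under the null hypothesis with K = 5 equally spaced looks and counts how often at least one interim Z-statistic exceeds the unadjusted 1.96 cutoff; the stage sizes and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n_per_stage, n_sims = 5, 100, 100_000

# Increments of the test statistic's numerator contributed by each stage
increments = rng.standard_normal((n_sims, K)) * np.sqrt(n_per_stage)
cum_sums = np.cumsum(increments, axis=1)                 # S_k after stage k
info = np.sqrt(n_per_stage * np.arange(1, K + 1))        # sqrt(n_k)
z = cum_sums / info                                      # Z_k at each look

# Naive bound if the K looks were independent: 1 - 0.95^5 ~ 0.226
print(1 - 0.95 ** K)

# Actual rate with correlated looks -- smaller, but still far above 0.05
any_reject = np.any(np.abs(z) > 1.96, axis=1).mean()
print(any_reject)   # roughly 0.14 in repeated runs
```

The looks are positively correlated (each contains the earlier data), so the true family-wise error rate sits between the single-look .05 and the independence bound .226.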

Equivalently, we need K critical values, one for each of the K analyses (the interim looks plus the final analysis).

Pocock Approach (1977)

Derives a constant critical value across all stages that maintains the overall significance level at .05. The critical value depends on the number of analyses, but is the same for each interim look.

Ex. When K = 5, the Z critical value is 2.413 for each interim and the final analysis; when K = 4, it is 2.361.
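A quick simulation sketch (an illustration, not part of the notes) can check that the constant Pocock boundary 2.413 for K = 5 equally spaced looks keeps the overall type I error near .05:

```python
import numpy as np

rng = np.random.default_rng(1)
K, n_sims = 5, 200_000

# Z-statistics at K equally informative looks under the null
z = np.cumsum(rng.standard_normal((n_sims, K)), axis=1) / np.sqrt(np.arange(1, K + 1))

# Probability of crossing the constant Pocock boundary at any look
overall_alpha = np.any(np.abs(z) > 2.413, axis=1).mean()
print(overall_alpha)   # close to 0.05
```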

O'Brien-Fleming Approach (1979)

Proposed a sequential testing procedure whose critical values (in absolute value) decrease over the stages. The critical Z value depends on both the total number of analyses and the stage of the current analysis.


Ex. For K = 5, analyzing after each group of 200 subjects completes, the two-sided critical values shrink across the looks: 4.562, 3.226, 2.634, 2.281, and 2.040 at the final analysis.

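The O'Brien-Fleming boundary has a simple closed form once the overall constant is fixed: c_k = C · sqrt(K / k). The sketch below uses C ≈ 2.040, the standard published constant for K = 5 at two-sided α = .05 (the constant and values are from the standard tables, not from the notes):

```python
from math import sqrt

# O'Brien-Fleming boundaries: c_k = C * sqrt(K / k), with C chosen so the
# overall two-sided level is 0.05 (C ~ 2.040 for K = 5, per standard tables).
K, C = 5, 2.040
boundaries = [C * sqrt(K / k) for k in range(1, K + 1)]
print([round(b, 3) for b in boundaries])
# [4.562, 3.226, 2.634, 2.281, 2.04]
```

The early boundaries are extreme (|Z| > 4.5 at the first look), which is exactly what makes early stopping for superiority hard under this design.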

This makes it more difficult to declare superiority at the "earlier" looks, but it does not lose much of the original alpha at the final look. It is more conservative than Pocock and is the approach recommended by the FDA in their 2010 Guidance on Adaptive Designs.

Controlling the Overall Significance Level

An issue with classical group sequential procedures is that the number and timing of the interim analyses must be specified in advance.

Alpha-Spending

From Lan and DeMets (1983, Biometrika): adjust the levels via an "alpha-spending function". Think of it as each analysis spending a portion of the overall alpha.

This can work in conjunction with O'Brien-Fleming.
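A sketch of the O'Brien-Fleming-type spending function of Lan and DeMets, α(t) = 4(1 − Φ(z_{α/4}/√t)) for a two-sided level α, where t is the information fraction. The helper quantile function below is a simple illustrative bisection, not a production implementation:

```python
from math import erf, sqrt

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def inv_phi(p, lo=-10.0, hi=10.0):
    """Standard normal quantile by bisection (illustrative, not optimized)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def obf_spending(t, alpha=0.05):
    """Two-sided O'Brien-Fleming-type spending function of Lan & DeMets:
    alpha(t) = 4 * (1 - Phi(z_{alpha/4} / sqrt(t)))."""
    z = inv_phi(1 - alpha / 4)
    return 4.0 * (1.0 - phi(z / sqrt(t)))

# Cumulative alpha spent at 5 equally spaced looks
for k in range(1, 6):
    print(k, round(obf_spending(k / 5), 5))
```

Almost no alpha is spent at the earliest looks (α(0.2) is on the order of 10⁻⁶), and the full .05 is available by t = 1, mirroring the O'Brien-Fleming boundary shape.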

Interpretation:

Focus on the significance levels, and note that the cumulative alpha spent and the nominal significance level at a stage are two distinct values; they are not equal. Computing critical values and the corresponding significance levels requires knowledge of multivariate normal distributions. There is no "simple" equation to get from α(s) to Z; the values are typically obtained via numerical integration.
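To make the "no simple equation" point concrete, here is an illustrative sketch that recovers stage-wise boundaries from a spending function by Monte Carlo bisection rather than the numerical integration used in practice. It uses the Pocock-type spending function of Lan and DeMets, α(t) = α·ln(1 + (e − 1)t); all tuning values (simulation size, seed, bisection range) are arbitrary choices for illustration:

```python
import numpy as np
from math import e, log

alpha, K, n_sims = 0.05, 5, 400_000
rng = np.random.default_rng(2)

# Correlated Z-statistics at K equally spaced looks under the null
z = np.cumsum(rng.standard_normal((n_sims, K)), axis=1) / np.sqrt(np.arange(1, K + 1))

# Pocock-type spending: cumulative alpha to be spent by each look
spend = [alpha * log(1 + (e - 1) * k / K) for k in range(1, K + 1)]

boundaries = []
escaped = np.zeros(n_sims, dtype=bool)   # paths that already crossed a boundary
for k in range(K):
    target = spend[k]                    # cumulative exit probability by stage k
    lo, hi = 1.0, 6.0
    for _ in range(40):                  # bisection on the stage-k critical value
        c = (lo + hi) / 2.0
        exit_prob = (escaped | (np.abs(z[:, k]) > c)).mean()
        if exit_prob > target:
            lo = c
        else:
            hi = c
    c = (lo + hi) / 2.0
    boundaries.append(round(c, 2))
    escaped |= np.abs(z[:, k]) > c
print(boundaries)   # roughly constant, near Pocock's 2.413
```

Each boundary depends on the joint distribution of all earlier Z-statistics (via the `escaped` indicator), which is why no closed-form mapping from α(s) to Z exists.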

/*
	plots=boundary - graph observed standardized test statistics
    	at each interim
    errspend - print the cumulative error spent at each stage
    bscale=pvalue - print p-values instead of critical values
    info=equal - equally spaced information levels
    stop=reject - stopping to reject the null (default)
*/
proc seqdesign errspend plots=boundary;
TwoSidedObrienFleming: design nstages=5 alpha=0.05
alt=twosided info=equal method=errfuncobf
stop=reject;
run;


Note that the bscale=pvalue option gives one tail of the distribution, so to get the two-sided significance level we need to multiply the upper or lower boundary value by 2.
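A small sketch of that conversion, using the familiar O'Brien-Fleming final-stage boundary Z = 2.040 for K = 5 as the example value:

```python
from math import erf, sqrt

def one_tail_p(z):
    """Upper-tail p-value, 1 - Phi(z)."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

z_final = 2.040                      # O'Brien-Fleming final boundary, K = 5
p_upper = one_tail_p(z_final)        # one tail, as bscale=pvalue reports
sig_level = 2 * p_upper              # two-sided nominal significance level
print(round(p_upper, 4), round(sig_level, 4))
```

The doubled value (about .041) is the nominal two-sided level available at the final look, consistent with O'Brien-Fleming retaining most of the original alpha.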


Pocock Approach in SAS
/* Pocock alpha-spending for a two-sided test with
	two-sided alpha of .05 spent by the final analysis */
proc seqdesign errspend bscale=pvalue;
TwoSidedPocock: design nstages=5 alpha=0.05
alt=twosided info=equal method=poc stop=reject;
run;


Interim Analyses For Safety

Interim analysis is not necessarily easy from an administrative and study-conduct perspective. We need to determine whether the data are still recent enough to be included: weeks or months may pass between the last subject visit and the generation of interim results due to data entry and cleaning. Depending on the size of the study, the ideal goal is a lag of less than 60 days between data collection at the sites and the interim analysis report; otherwise the interim analysis may be obsolete by the time it is completed.

Inspection of adverse events and serious adverse events is the primary concern; labs, vital signs, etc. also need to be inspected. Unlike efficacy, there are often no formal stopping rules based on p-values or a parametric test. If it is felt that there is a safety concern, the study may be stopped regardless of statistical significance between treatments.

The results of the interim analysis are also inspected by a Data Safety and Monitoring Board (DSMB), sometimes called a Data Monitoring Committee. These boards usually consist of clinicians and statisticians who are independent of the sponsor and the study.

The sponsor will often hire an outside group (a Contract Research Organization, or CRO) to perform the interim analyses. The statistician at the CRO has the randomization schedule, and the analysis group cannot divulge ANY information to the sponsor or to any personnel involved in the study; results are presented only to the independent DSMB.

 


Revision #4
Created 31 March 2023 14:12:02 by Elkip
Updated 31 March 2023 16:07:53 by Elkip