Matching

The aim of matching is to remove confounding by making the compared subjects similar on a potential confounder. Doing so eliminates (or reduces) confounding and also reduces variability, thereby increasing power.

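 Recall the paired t-test for two dependent (paired) samples: $t = \bar{d} / (s_d / \sqrt{n})$ with n-1 degrees of freedom and standard error $s_d / \sqrt{n}$, where $\bar{d}$ and $s_d$ are the mean and standard deviation of the n within-pair differences. The test statistic is inversely related to the standard error, so lower variability gives a larger statistic.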
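 The link between the standard error and the paired t statistic can be checked numerically. The sketch below uses made-up paired measurements (the vectors `x` and `y` are hypothetical, not data from these notes) and compares a hand calculation with R's built-in `t.test(..., paired = TRUE)`.

```r
# Hypothetical paired measurements (illustrative only)
x <- c(12.1, 11.4, 13.2, 10.8, 12.9, 11.7)
y <- c(11.5, 11.0, 12.1, 10.9, 12.0, 11.1)

d  <- x - y               # within-pair differences
n  <- length(d)           # number of pairs
se <- sd(d) / sqrt(n)     # standard error of the mean difference

t_manual <- mean(d) / se  # paired t statistic, n - 1 degrees of freedom

# Built-in paired t-test for comparison
fit <- t.test(x, y, paired = TRUE)
print(c(manual = t_manual, builtin = unname(fit$statistic), df = unname(fit$parameter)))
```

 A smaller standard deviation of the differences shrinks `se` and therefore inflates the statistic, which is the sense in which matching can increase power.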

 Types of Matching 

 

 Matched Pairs (covered today)

 Categorical Matching (unmatched analysis, stratified or regression)
   Stratify cases, then find an equal number of controls for each category (or an equal multiple).

 Caliper Matching
   Only for continuous variables
   Similar to categorical but not the same

 Nearest Neighbor
   Select the 'closest' control as the match
   May have minimum match criteria
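 Nearest-neighbor matching with a minimum match criterion can be sketched as follows. This is one simple greedy approach under stated assumptions, not necessarily the procedure used in the course: the data, the helper `match_nearest()`, and the 3-year age caliper are all hypothetical.

```r
# Hypothetical ages for cases and candidate controls (illustrative only)
cases    <- data.frame(id = 1:4,     age = c(34, 51, 47, 62))
controls <- data.frame(id = 101:108, age = c(30, 36, 45, 49, 52, 58, 70, 33))
caliper  <- 3  # assumed maximum allowed age difference for a match

# Greedy nearest-neighbor matching without replacement:
# each case takes the closest still-available control, if within the caliper.
match_nearest <- function(cases, controls, caliper) {
  available <- rep(TRUE, nrow(controls))
  matched   <- rep(NA_integer_, nrow(cases))
  for (i in seq_len(nrow(cases))) {
    dist <- abs(controls$age - cases$age[i])
    dist[!available] <- Inf            # controls already used cannot be reused
    j <- which.min(dist)               # closest remaining control
    if (dist[j] <= caliper) {          # minimum match criterion
      matched[i]   <- controls$id[j]
      available[j] <- FALSE
    }
  }
  data.frame(case_id = cases$id, control_id = matched)
}

result <- match_nearest(cases, controls, caliper)
print(result)
```

 Cases with no control inside the caliper are left unmatched (`NA`). Greedy matching depends on the order in which cases are processed, so results can differ from an optimal matching.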

 

 

 

 Matching in Follow-Up Study 

 Remove confounding (C) between the exposed (E) and unexposed groups in the study sample by matching on the potential confounders.

 There are 4 possible combinations of outcomes in the exposed and unexposed members of a pair. In concordant pairs both members have the same outcome; in discordant pairs the outcomes differ.

 Matched data can be presented as a 2x2 table of pairs. Notice that the row and column totals of the paired table equal the values of cells a, b, c, and d in the original (unmatched) table.

 From the paired follow-up study table we calculate the risk ratio (relative risk) using the marginal totals. If we take the risk ratio from both the paired table and the original table, we find they are the same (1.5).
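 The agreement between the two risk ratios can be illustrated with a short sketch. The pair counts below are hypothetical (they are not the example from these notes), and the variable names are chosen here for illustration only.

```r
# Hypothetical pair counts (illustrative only)
both_d       <- 10  # exposed and unexposed member both diseased
exp_only_d   <- 15  # only the exposed member diseased
unexp_only_d <- 5   # only the unexposed member diseased
neither_d    <- 20  # neither member diseased
n_pairs <- both_d + exp_only_d + unexp_only_d + neither_d

# Paired table: rows = exposed member's outcome, columns = unexposed member's outcome
paired <- matrix(c(both_d, exp_only_d, unexp_only_d, neither_d),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(exposed = c("D", "noD"), unexposed = c("D", "noD")))

# Risk ratio from the margins of the paired table
rr_paired <- sum(paired["D", ]) / sum(paired[, "D"])

# Unmatched 2x2 table rebuilt from those margins
a  <- sum(paired["D", ])   # exposed, diseased
b  <- sum(paired[, "D"])   # unexposed, diseased
c_ <- n_pairs - a          # exposed, not diseased
d  <- n_pairs - b          # unexposed, not diseased
rr_unmatched <- (a / (a + c_)) / (b / (b + d))

print(c(paired = rr_paired, unmatched = rr_unmatched))
```

 Because each pair contributes exactly one exposed and one unexposed subject, both risks share the same denominator and the two point estimates coincide.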

 Also, the confidence interval for the RR in a follow-up study would be:
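 A commonly used large-sample interval for the paired risk ratio works on the log scale. The notation is an assumption made here for illustration: in the table of pairs, a' is the number of pairs in which both members are diseased, b' the number in which only the exposed member is diseased, and c' the number in which only the unexposed member is diseased.

```latex
\widehat{RR} = \frac{a' + b'}{a' + c'}, \qquad
SE\!\left(\ln \widehat{RR}\right) = \sqrt{\frac{b' + c'}{(a' + b')(a' + c')}}, \qquad
\text{CI: } \widehat{RR} \times \exp\!\left(\pm Z \cdot SE\!\left(\ln \widehat{RR}\right)\right)
```

 Only the discordant counts b' and c' appear in the numerator of the variance, so strongly concordant pairs lead to a narrower interval.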

 Matching in Case-Control Study 

 Remove confounding (C) between cases and controls in the study sample by matching on potential confounders: for each case we select a control with the same values of the confounding variables.

 For case control studies we set up our pairs differently: 

 This also results in a different estimate of the odds ratio, which is the ratio of the two discordant-pair counts. Likewise, the confidence interval would be

 $OR \times \exp\left(\pm Z \sqrt{1/b' + 1/c'}\right)$

 In case-control studies, the OR that ignores the matching will generally not equal the matched OR. This is unlike the situation for the RR, where the point estimate is the same whether or not the matching is taken into account.
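 The difference between the matched and unmatched odds ratios can be seen in a small sketch. The pair counts are hypothetical, and the orientation of b' and c' (b' = case exposed and control unexposed, c' = case unexposed and control exposed) is an assumption, since the table layout is not shown here.

```r
# Hypothetical pair counts (illustrative only)
both_exposed      <- 10  # case and control both exposed
case_only_exposed <- 15  # b': case exposed, control unexposed (assumed orientation)
ctrl_only_exposed <- 5   # c': case unexposed, control exposed (assumed orientation)
neither_exposed   <- 20  # neither exposed

# Matched OR uses only the discordant pairs
or_matched <- case_only_exposed / ctrl_only_exposed

# 95% CI: OR x exp(+/- Z * sqrt(1/b' + 1/c'))
z          <- qnorm(0.975)
se_log_or  <- sqrt(1 / case_only_exposed + 1 / ctrl_only_exposed)
ci_matched <- or_matched * exp(c(-1, 1) * z * se_log_or)

# Unmatched OR: collapse the pairs into an ordinary 2x2 table
cases_exp    <- both_exposed + case_only_exposed
cases_unexp  <- ctrl_only_exposed + neither_exposed
ctrls_exp    <- both_exposed + ctrl_only_exposed
ctrls_unexp  <- case_only_exposed + neither_exposed
or_unmatched <- (cases_exp * ctrls_unexp) / (cases_unexp * ctrls_exp)

print(c(matched = or_matched, lower = ci_matched[1], upper = ci_matched[2],
        unmatched = or_unmatched))
```

 With these counts the unmatched estimate is closer to 1 than the matched one, which is the usual direction of the bias when matching is ignored in a case-control analysis.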

 The McNemar Test 

 The McNemar test is a non-parametric test for paired nominal data. Its test statistic follows a chi-square distribution, and it can be used for retrospective case-control or follow-up studies. It assumes:

 

 The two groups are mutually exclusive 

 A random sample 

 

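 H0: The proportion with the disease is the same in participants with exposure and those without exposure (RR = 1)

 Ha: The proportion with the disease is not the same in participants with exposure and those without exposure (RR != 1)

 The test statistic is $\chi^2 = (b' - c')^2 / (b' + c')$ with df = 1, where b' and c' are the counts of the two types of discordant pairs.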
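 The McNemar statistic depends only on the discordant pairs, which can be verified by hand. The sketch below uses hypothetical pair-level outcomes (the vectors reuse the names `exposed.d` and `unexposed.d` but the values are made up) and compares the hand calculation with `mcnemar.test()` without continuity correction.

```r
# Hypothetical pair-level outcomes: 1 = diseased, 0 = not diseased (illustrative only)
exposed.d   <- rep(c(1, 1, 0, 0), times = c(10, 15, 5, 20))
unexposed.d <- rep(c(1, 0, 1, 0), times = c(10, 15, 5, 20))

tab <- table(exposed.d, unexposed.d)

# Discordant pairs
b  <- tab["1", "0"]   # exposed member diseased, unexposed member not
c_ <- tab["0", "1"]   # unexposed member diseased, exposed member not

# McNemar chi-square with 1 degree of freedom (no continuity correction)
chisq_manual <- (b - c_)^2 / (b + c_)
p_manual     <- pchisq(chisq_manual, df = 1, lower.tail = FALSE)

fit <- mcnemar.test(tab, correct = FALSE)
print(c(manual = as.numeric(chisq_manual), builtin = as.numeric(fit$statistic)))
```

 Dropping `correct = FALSE` applies the continuity correction, which R uses by default and which gives a slightly smaller statistic.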

 Matched Analyses and Mantel-Haenszel 

 Mantel-Haenszel methods applied to strata defined by matched sets are equivalent to the conventional matched methods. This works for both cohort (follow-up) and case-control studies. When each matched pair is a stratum, we carry out the MH method across the pairs.

 So from the example above, if we formed 50 strata for 50 matched pairs, we would end up with 50 tables, one per pair.

 We then apply the mRR and MH methods across the 50 strata to obtain the McNemar result.
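 The equivalence can be checked directly by treating every pair as its own stratum. The sketch below builds a hypothetical pair-level data set (the counts are made up; the data frame reuses the column names `exposed`, `diseased`, and `match` for illustration) and compares the uncorrected Mantel-Haenszel and McNemar statistics.

```r
# Hypothetical pair counts (illustrative only)
n_both <- 10; n_exp_only <- 15; n_unexp_only <- 5; n_neither <- 20
counts <- c(n_both, n_exp_only, n_unexp_only, n_neither)

# Outcome of the exposed and unexposed member of each pair (1 = diseased)
exposed.d   <- rep(c(1, 1, 0, 0), times = counts)
unexposed.d <- rep(c(1, 0, 1, 0), times = counts)
n_pairs     <- length(exposed.d)

# Long format: one row per subject, one stratum (match) per pair
pairs <- data.frame(
  match    = rep(seq_len(n_pairs), times = 2),
  exposed  = rep(c(1, 0), each = n_pairs),
  diseased = c(exposed.d, unexposed.d)
)

# Mantel-Haenszel test over the pair strata, no continuity correction
mh <- mantelhaen.test(pairs$exposed, pairs$diseased, pairs$match, correct = FALSE)

# McNemar test on the table of pairs, no continuity correction
mc <- mcnemar.test(table(exposed.d, unexposed.d), correct = FALSE)

print(c(MH = as.numeric(mh$statistic), McNemar = as.numeric(mc$statistic)))
```

 Concordant pairs contribute nothing to either statistic, which is why the two tests agree.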

 

 

 Testing Interaction 

 It is possible to test for interaction between a matching variable and exposure (that is, whether a matching variable modifies the effect of E on D).

 We obtain an OR for each subgroup of the interaction variable and conduct a formal test to see whether the two ORs are equal. 
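 One simple way to compare two subgroup ORs is a Wald test on the log scale. This is offered as an illustrative option under stated assumptions, not necessarily the formal test intended in these notes; the discordant-pair counts are hypothetical.

```r
# Hypothetical discordant-pair counts within two levels of a matching variable
b1 <- 20; c1 <- 8    # subgroup 1
b2 <- 12; c2 <- 10   # subgroup 2

# Matched OR in each subgroup (ratio of discordant pairs)
or1 <- b1 / c1
or2 <- b2 / c2

# Wald test of H0: OR1 = OR2, using var(log OR) ~ 1/b + 1/c in each subgroup
diff_log_or <- log(or1) - log(or2)
se_diff     <- sqrt(1 / b1 + 1 / c1 + 1 / b2 + 1 / c2)
z           <- diff_log_or / se_diff
p_value     <- 2 * pnorm(-abs(z))

print(c(OR1 = or1, OR2 = or2, z = z, p = p_value))
```

 A small p-value would suggest that the effect of exposure differs across levels of the matching variable. With sparse discordant counts the normal approximation is rough.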

 R-1 Matching 

 Match one index subject to R referent subjects. We create a stratum for every matched set, so each stratum contains a matched set of R + 1 subjects.

 

 In a case control study: 

 

 Index = case 

 Referent = control 

 Stratum: 

 

 

 Exposure status: 

 

 

 We can then use the following formulas to estimate the OR and compute the chi-squared statistic:
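 The standard Mantel-Haenszel forms, with each matched set i as a stratum, are sketched below. The notation is an assumption made here for illustration: within stratum i, a_i and c_i are the exposed and unexposed cases, b_i and d_i the exposed and unexposed controls, n_{1i} = a_i + b_i and n_{0i} = c_i + d_i the exposed and unexposed totals, m_{1i} = a_i + c_i and m_{0i} = b_i + d_i the case and control totals, and n_i = R + 1 the stratum size.

```latex
\widehat{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}

\chi^2_{MH} = \frac{\left( \sum_i a_i - \sum_i \dfrac{n_{1i}\, m_{1i}}{n_i} \right)^2}
                   {\sum_i \dfrac{n_{1i}\, n_{0i}\, m_{1i}\, m_{0i}}{n_i^2 \,(n_i - 1)}},
\qquad df = 1
```

 With R = 1 these reduce to the matched-pair odds ratio and the McNemar statistic.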

 

 

 

 McNemar's test cannot be used in R-1 matching, because it is defined for matched pairs (1-1 matching) only.

 When to Match? 

 

 How hard is it to obtain subjects?

 How hard is it to match?

 How much does matching buy?
   How influential are the confounders? How strongly correlated are the matched pairs?
   Is variation within the pairs small relative to variation between pairs?

 What do we gain?
   Eliminate/reduce confounding
   Reduce variability and increase power

 

 

 

 Issues with Matching 

 

 Matched studies use restricted sampling

 The exposed and the cases will not typically represent the general population
   Ex. a case-control study of smoking and lung cancer
     The cases will on average be more male and older than the general population, and also may not reflect its racial distribution
     We may not be able to generalize the results

 

 Code 

### Matched pairs can be analyzed using the Mantel-Haenszel method.
### Each matched pair is treated as a separate stratum.
### correct=FALSE turns off the continuity correction.
mantelhaen.test(pairs$exposed, pairs$diseased, pairs$match, correct=FALSE)

### Table of pairs: disease status of the exposed vs. the unexposed member of each pair
table(exposed.d, unexposed.d)

### Analyze the paired data with mcnemar.test()
mcnemar.test(table(exposed.d, unexposed.d), correct=FALSE)   # without continuity correction
mcnemar.test(table(exposed.d, unexposed.d))                  # with continuity correction (default)

### Matched sets (R-1 matching) can be analyzed using the Mantel-Haenszel method.
### Each matched set is treated as a separate stratum.
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum, correct=FALSE)   # without continuity correction
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum)                  # with continuity correction (default)
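
 To make the R-1 call above concrete, the sketch below builds a small hypothetical data set with 2 controls per case and checks the Mantel-Haenszel OR against a hand calculation. The data are made up; the `patterns` and `sets` objects are helpers introduced here, and only the column names `smoke`, `casecon`, and `casenum` follow the calls above.

```r
# Hypothetical 2-to-1 matched sets, summarized by exposure pattern (illustrative only)
patterns <- data.frame(
  case_exposed     = c(1, 1, 1, 0, 0, 0),  # is the case exposed?
  controls_exposed = c(0, 1, 2, 0, 1, 2),  # number of exposed controls (out of 2)
  n_sets           = c(6, 4, 2, 5, 3, 1)   # number of matched sets with this pattern
)

# Expand to one row per matched set, then to one row per subject
sets <- patterns[rep(seq_len(nrow(patterns)), patterns$n_sets), ]
sets$casenum <- seq_len(nrow(sets))

Rto1 <- do.call(rbind, lapply(seq_len(nrow(sets)), function(i) {
  k <- sets$controls_exposed[i]
  data.frame(
    casenum = sets$casenum[i],               # matched-set identifier (stratum)
    casecon = c(1, 0, 0),                    # 1 = case, 0 = control
    smoke   = c(sets$case_exposed[i], rep(c(1, 0), times = c(k, 2 - k)))
  )
}))

# Mantel-Haenszel analysis with each matched set as a stratum
mh <- mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum, correct = FALSE)

# Hand calculation of the MH odds ratio from the per-stratum 2x2 tables
tab <- table(Rto1$smoke, Rto1$casecon, Rto1$casenum)
n_i <- apply(tab, 3, sum)
or_hand <- sum(tab["1", "1", ] * tab["0", "0", ] / n_i) /
           sum(tab["1", "0", ] * tab["0", "1", ] / n_i)

print(c(builtin = as.numeric(mh$estimate), by_hand = or_hand))
```

 Sets in which the case and all controls share the same exposure status contribute nothing to the estimate, mirroring the role of concordant pairs in 1-1 matching.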