Matching

The aim of matching is to remove confounding by pairing subjects who are similar on a potential confounder. Doing so eliminates (or reduces) confounding and also reduces variability, thereby increasing power.

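Recall the paired t-test for two dependent (matched) samples:

t = d_bar / SE(d_bar)

with n-1 degrees of freedom and standard error:

SE(d_bar) = s_d / sqrt(n)

where d_bar is the mean of the within-pair differences, s_d is their standard deviation, and n is the number of pairs.

The test statistic is inversely related to the standard error, so reducing the variance of the differences increases the statistic and therefore power.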
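
A minimal R sketch of the paired t statistic, computed by hand from the within-pair differences and checked against `t.test(..., paired = TRUE)`. The vectors `x` and `y` are made-up illustrative values, not course data.

```r
# Hypothetical paired measurements (illustrative values only)
x <- c(12.1, 11.4, 13.0, 10.8, 12.6, 11.9)
y <- c(11.2, 11.0, 12.1, 10.9, 11.8, 11.1)

d  <- x - y              # within-pair differences
n  <- length(d)          # number of pairs
se <- sd(d) / sqrt(n)    # standard error of the mean difference
t_hand <- mean(d) / se   # paired t statistic with n - 1 df

# Built-in paired t-test for comparison
res <- t.test(x, y, paired = TRUE)
c(by_hand = t_hand, t_test = unname(res$statistic), df = unname(res$parameter))
```

A smaller `sd(d)` gives a smaller standard error and a larger statistic, which is where matching gains its power.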

Types of Matching

Matched Pairs (covered today)
Categorical Matching (unmatched analysis, stratified or regression)
- Stratify cases, then find equal number of controls for each category (or equal multiple).
Caliper Matching
- Only for continuous variables
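- Similar to categorical matching, but a match must fall within a set distance (the caliper) of the index subject's value rather than in the same category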
Nearest Neighbor
- Select 'closest' control as match
- May have minimum match criteria
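
As a rough illustration of nearest-neighbor matching with a minimum match criterion, the sketch below greedily pairs each case with the closest unused control on age, and leaves a case unmatched if no control is within the caliper. The ages, the 3-year caliper, and all variable names are assumptions made for the example.

```r
# Hypothetical ages for 4 cases and 8 candidate controls (illustrative only)
case_age    <- c(34, 51, 47, 62)
control_age <- c(30, 36, 45, 49, 52, 58, 70, 33)
caliper     <- 3   # assumed maximum allowed age difference, in years

available <- rep(TRUE, length(control_age))
match_idx <- rep(NA_integer_, length(case_age))

for (i in seq_along(case_age)) {
  dist <- abs(control_age - case_age[i])
  dist[!available] <- Inf        # each control is used at most once
  j <- which.min(dist)           # closest remaining control
  if (dist[j] <= caliper) {      # minimum match criterion
    match_idx[i] <- j
    available[j] <- FALSE
  }
}

# Cases with no control inside the caliper stay unmatched (NA)
data.frame(case_age, control_age = control_age[match_idx])
```

Greedy matching depends on the order in which cases are processed; it is used here only because it is short.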

Matching in Follow-Up Study

Remove confounding (C) in the study sample between exposed (E) and unexposed subjects by matching on the potential confounders.

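There are 4 possible combinations of outcomes within an exposed/unexposed pair:

- Both the exposed and the unexposed subject develop the disease (a')
- Only the exposed subject develops the disease (b')
- Only the unexposed subject develops the disease (c')
- Neither subject develops the disease (d')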

Concordant pairs have the same outcome in both members; discordant pairs have different outcomes.

An example presentation of matched data as 2x2 tables:

Notice that the row and column totals of the paired table equal the values of cells a, b, c, and d in the original (unmatched) table.

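From the paired follow-up table we calculate the matched risk ratio (mRR):

mRR = (number of pairs in which the exposed subject is diseased) / (number of pairs in which the unexposed subject is diseased)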

If we compute the risk ratio from each of the two tables above, we get the same value (1.5).
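
The sketch below illustrates both points with hypothetical pair counts (these are not the values from the original tables; they were chosen so that the RR also comes out to 1.5). It assumes the paired table has the exposed member's outcome in rows and the unexposed member's outcome in columns, with cells labelled a', b', c', d'.

```r
# Hypothetical pair counts (NOT the original table's values)
# rows: exposed member's outcome; columns: unexposed member's outcome
paired <- matrix(c(10, 20,
                   10, 60),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(exposed   = c("D", "notD"),
                                 unexposed = c("D", "notD")))

a_p <- paired["D", "D"];    b_p <- paired["D", "notD"]
c_p <- paired["notD", "D"]; d_p <- paired["notD", "notD"]

# Margins of the paired table recover the unmatched 2x2 cells
a      <- a_p + b_p   # exposed, diseased
b      <- c_p + d_p   # exposed, not diseased
c_cell <- a_p + c_p   # unexposed, diseased
d      <- b_p + d_p   # unexposed, not diseased

rr_matched   <- (a_p + b_p) / (a_p + c_p)
rr_unmatched <- (a / (a + b)) / (c_cell / (c_cell + d))
c(matched = rr_matched, unmatched = rr_unmatched)
```

Both estimates agree because each group contains one member of every pair, so the denominators cancel.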

The confidence interval for the RR in a matched follow-up study would be:
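
One commonly used large-sample interval for the matched RR takes the standard error of ln(mRR) to be sqrt((b' + c') / ((a' + b')(a' + c'))). The sketch below applies it to hypothetical pair counts. The cell layout (a' both diseased, b' only exposed diseased, c' only unexposed diseased, d' neither) and the counts are assumptions, and this may not be the exact formula shown in the original notes.

```r
# Hypothetical pair counts, assumed layout:
# a' both diseased, b' only exposed diseased,
# c' only unexposed diseased, d' neither diseased
a_p <- 10; b_p <- 20; c_p <- 10; d_p <- 60

mrr    <- (a_p + b_p) / (a_p + c_p)
se_log <- sqrt((b_p + c_p) / ((a_p + b_p) * (a_p + c_p)))  # SE of ln(mRR)
z      <- qnorm(0.975)                                     # 95% interval

ci <- mrr * exp(c(-1, 1) * z * se_log)
c(mRR = mrr, lower = ci[1], upper = ci[2])
```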

Matching in Case-Control Study

Remove confounding (C) in the study sample between cases and controls by matching on potential confounders: for each case we select a control with the same values of the confounding variables.

For case control studies we set up our pairs differently:

This also results in a different estimate of the odds ratio, which uses only the discordant pairs:

OR = b' / c'

Likewise, the confidence interval would be:

OR x exp(+/- Z*sqrt(1/b' + 1/c'))

ORs in case-control studies will not be the same if the matching is ignored. This is unlike the situation for the RR, where the point estimate is the same whether or not the matching is taken into account.
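
A small R sketch of the matched OR, its confidence interval, and the OR obtained when the matching is ignored. The pair counts are hypothetical, and the layout (b' = only the case exposed, c' = only the control exposed) is an assumption.

```r
# Hypothetical pair counts (case exposure in rows, control exposure in columns)
# a' both exposed, b' only case exposed, c' only control exposed, d' neither
a_p <- 15; b_p <- 30; c_p <- 12; d_p <- 43

# Matched analysis: discordant pairs only
or_matched <- b_p / c_p
z  <- qnorm(0.975)   # 95% interval
ci <- or_matched * exp(c(-1, 1) * z * sqrt(1 / b_p + 1 / c_p))

# Ignoring the matching: collapse to an ordinary exposure-by-status 2x2 table
case_exp <- a_p + b_p; case_unexp <- c_p + d_p
ctl_exp  <- a_p + c_p; ctl_unexp  <- b_p + d_p
or_unmatched <- (case_exp * ctl_unexp) / (case_unexp * ctl_exp)

c(matched = or_matched, lower = ci[1], upper = ci[2], unmatched = or_unmatched)
```

With these counts the unmatched OR is closer to 1 than the matched OR.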

The McNemar Test

The McNemar test is a non-parametric test for paired nominal data. Its test statistic follows a chi-square distribution under H0, and it can be used for retrospective case-control or follow-up studies. It assumes:

The two groups are mutually exclusive
A random sample

H0: The proportion of some disease is the same in participants with exposure and those without exposure (RR=1)
Ha: The proportion of some disease is not the same in participants with exposure and those without exposure (RR != 1)

chi-square = (b' - c')^2 / (b' + c')

with df = 1
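
The McNemar statistic depends only on the discordant cells. The sketch below computes (b' - c')^2 / (b' + c') by hand for a hypothetical paired table and compares it with `mcnemar.test()` without the continuity correction. The counts are made up for illustration.

```r
# Hypothetical paired 2x2 table; off-diagonal cells are the discordant pairs
paired <- matrix(c(15, 30,
                   12, 43),
                 nrow = 2, byrow = TRUE)

b_p <- paired[1, 2]
c_p <- paired[2, 1]

chi_hand <- (b_p - c_p)^2 / (b_p + c_p)                 # McNemar chi-square
p_hand   <- pchisq(chi_hand, df = 1, lower.tail = FALSE)

res <- mcnemar.test(paired, correct = FALSE)
c(by_hand = chi_hand, mcnemar = unname(res$statistic), p = p_hand)
```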

Matched Analyses and Mantel-Haenszel

Mantel-Haenszel (MH) methods applied to strata defined by the matched sets are equivalent to the conventional matched methods. This works for cohort (follow-up) and case-control studies. When each matched pair is a stratum, we carry out the MH method across the pairs.

So from the example above, if we formed 50 strata from 50 matched pairs, we would end up with 50 tables:

We then apply the MH estimate and MH test across the 50 strata, which reproduces the matched-pair estimate and the McNemar result.
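
To see the equivalence numerically, the sketch below expands hypothetical pair counts into one row per subject, treats each pair as its own stratum in `mantelhaen.test()`, and compares the uncorrected MH chi-square with the uncorrected McNemar chi-square. The counts and the data-frame name `long` are assumptions for illustration.

```r
# Hypothetical pair counts: both diseased, only exposed diseased,
# only unexposed diseased, neither diseased
counts  <- c(both = 10, exp_only = 20, unexp_only = 10, neither = 60)
exp_d   <- rep(c(1, 1, 0, 0), times = counts)   # exposed member's outcome
unexp_d <- rep(c(1, 0, 1, 0), times = counts)   # unexposed member's outcome
n_pairs <- length(exp_d)

# One row per subject; the pair id is the stratum
long <- data.frame(
  match    = rep(seq_len(n_pairs), times = 2),
  exposed  = rep(c(1, 0), each = n_pairs),
  diseased = c(exp_d, unexp_d)
)

mh <- mantelhaen.test(long$exposed, long$diseased, long$match, correct = FALSE)
mc <- mcnemar.test(table(exp_d, unexp_d), correct = FALSE)

c(MH = unname(mh$statistic), McNemar = unname(mc$statistic))
```

Concordant pairs contribute nothing to either statistic, which is why only the 30 discordant pairs matter here.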

Testing Interaction

It is possible to test for interaction between a matching variable and exposure (whether a matching variable modifies the effect of E on D).

We obtain an OR for each subgroup of the interaction variable and conduct a formal test to see whether the two ORs are equal.
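
One simple way to carry out such a comparison is a Wald-type z test on the difference of the log ORs, using 1/b' + 1/c' as the variance of each log OR. This is offered as an illustrative option, not necessarily the test used in the course; the discordant-pair counts and the subgroup labels are hypothetical.

```r
# Hypothetical discordant-pair counts within two levels of a matching variable
b1 <- 30; c1 <- 10   # subgroup 1 (e.g. younger pairs)
b2 <- 18; c2 <- 15   # subgroup 2 (e.g. older pairs)

or1 <- b1 / c1       # matched OR in subgroup 1
or2 <- b2 / c2       # matched OR in subgroup 2

# Wald-type z test of H0: OR1 = OR2, on the log scale
se_diff <- sqrt(1 / b1 + 1 / c1 + 1 / b2 + 1 / c2)
z       <- (log(or1) - log(or2)) / se_diff
p_value <- 2 * pnorm(-abs(z))

c(OR1 = or1, OR2 = or2, z = z, p = p_value)
```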

R-1 Matching

Match one index subject to R referent subjects. We create a stratum for every matched set, so each stratum includes a matched set of R + 1 subjects.

In a case control study:

Index = case
Referent = control
Stratum:
Exposure status:
We can then use the following formulas to estimate the OR and the chi-squared statistic:

McNemar's test cannot be used in R-1 matching.
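
A sketch of a 2-to-1 matched case-control analysis using the Mantel-Haenszel method, with each matched set as a stratum. The data are made up. The column names `smoke`, `casecon`, and `casenum` follow the code at the end of these notes, but their coding here (1 = smoker, 1 = case) is an assumption.

```r
# Hypothetical smoking status (1 = smoker) for 8 matched sets:
# one case and two controls per set
smoke_case <- c(1, 1, 1, 0, 0, 1, 1, 0)
smoke_ctl1 <- c(0, 1, 0, 0, 1, 0, 0, 0)
smoke_ctl2 <- c(0, 0, 0, 0, 0, 1, 0, 1)
n_sets <- length(smoke_case)

# One row per subject; casenum identifies the matched set (stratum)
Rto1 <- data.frame(
  casenum = rep(seq_len(n_sets), times = 3),
  casecon = rep(c(1, 0, 0), each = n_sets),   # 1 = case, 0 = control
  smoke   = c(smoke_case, smoke_ctl1, smoke_ctl2)
)

mh <- mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum, correct = FALSE)

# MH odds ratio by hand: sum(a_i * d_i / n_i) / sum(b_i * c_i / n_i), n_i = 3
exp_ctl <- smoke_ctl1 + smoke_ctl2
or_hand <- (sum(smoke_case * (2 - exp_ctl)) / 3) /
           (sum((1 - smoke_case) * exp_ctl) / 3)

c(MH_estimate = unname(mh$estimate), by_hand = or_hand)
```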

When to Match?

How hard is it to obtain subjects?
How hard is it to match?
How much does matching buy?
- How influential are the confounders? How strongly correlated are the matched pairs?
- Is variation within the pairs small relative to variation between pairs?
What do we gain?
- Eliminate/reduce confounding
- Reduce variability and increase power

Issues with Matching

Matched studies use restriction sampling
The exposed and the cases will not typically represent the general population
- Ex. in a case-control study of smoking and lung cancer, the cases will on average be older and more often male than the general population, and may not reflect its racial distribution
We may not be able to generalize results

Code

### Matched pairs can be analyzed using the Mantel-Haenszel method.
### Each matched pair is treated as a separate stratum.
### correct=FALSE turns off the continuity correction.
mantelhaen.test(pairs$exposed, pairs$diseased, pairs$match, correct=FALSE)

### Table of pairs (cross-tabulates exposed.d against unexposed.d)
table(exposed.d, unexposed.d)

### Analyze data with mcnemar.test()
mcnemar.test(table(exposed.d, unexposed.d), correct=FALSE)  # no continuity correction
mcnemar.test(table(exposed.d, unexposed.d))                 # continuity correction (default)

### Matched sets can be analyzed using the Mantel-Haenszel method.
### Each matched set is treated as a separate stratum.
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum, correct=FALSE)  # no continuity correction
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum)                 # continuity correction (default)