Matching

The aim of matching is to remove confounding by making the compared subjects similar on a potential confounder. Doing so eliminates (or reduces) confounding and also reduces variability, thereby increasing power.

Recall the paired t-test for two dependent (paired) samples:
image-1664806890708.png
with n-1 degrees of freedom and standard error:
image-1664806918982.png
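The test statistic is inversely related to the standard error, so a smaller variance of the paired differences gives a larger test statistic and more power.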
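As a minimal sketch of the paired t-test recalled above, the statistic can be computed by hand from the within-pair differences and compared with R's built-in `t.test()`. The measurements `x` and `y` are made-up illustrative values, not data from these notes.

```r
# Hypothetical paired measurements (illustrative values only)
x <- c(12.1, 11.4, 13.0, 10.8, 12.6, 11.9)
y <- c(11.2, 11.0, 12.1, 10.9, 11.8, 11.1)

d  <- x - y                # within-pair differences
n  <- length(d)
se <- sd(d) / sqrt(n)      # standard error of the mean difference
t_stat <- mean(d) / se     # paired t statistic with n - 1 degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)

# The built-in paired test should agree with the hand calculation
fit <- t.test(x, y, paired = TRUE)
c(by_hand = t_stat, built_in = unname(fit$statistic))
```

Because `se` sits in the denominator, tighter within-pair differences (smaller `sd(d)`) raise the statistic for the same mean difference.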

Types of Matching

  • Matched Pairs (covered today)
  • Categorical Matching (unmatched analysis, stratified or regression)
    • Stratify cases, then find an equal number (or an equal multiple) of controls for each category.
  • Caliper Matching
    • Only for continuous variables
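    • Similar to categorical matching, but a match is any control within a fixed distance (the caliper) of the index subject's value rather than in the same fixed category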
  • Nearest Neighbor
    • Select 'closest' control as match
    • May have minimum match criteria
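
The nearest-neighbor idea with a minimum match criterion can be sketched as a simple greedy loop. This is only one possible implementation, under assumptions not stated in the notes: matching is on a single continuous variable (age), each control is used at most once, and the caliper of 3 years is an arbitrary choice. All values are hypothetical.

```r
# Hypothetical ages for index subjects and a pool of potential controls
index_age   <- c(34, 51, 73)
control_age <- c(30, 36, 49, 58, 66, 80)
caliper     <- 3   # hypothetical maximum allowed age difference (years)

available <- rep(TRUE, length(control_age))
match_id  <- rep(NA_integer_, length(index_age))

for (i in seq_along(index_age)) {
  gap <- abs(control_age - index_age[i])
  gap[!available] <- Inf        # each control is used at most once
  j <- which.min(gap)           # 'closest' remaining control
  if (gap[j] <= caliper) {      # minimum match criterion
    match_id[i]  <- j
    available[j] <- FALSE
  }
}

# Index subjects with no control inside the caliper stay unmatched (NA)
data.frame(index_age, matched_control_age = control_age[match_id])
```

Greedy matching depends on the order in which index subjects are processed; the notes do not say how ties or ordering should be handled.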

Matching in Follow-Up Study

image-1664805506582.png
Matching on the potential confounders removes confounding (C) between the exposed (E) and unexposed groups in the study sample.

There are 4 possible combinations of outcomes in exposed and unexposed groups:
image-1664807476717.png
Concordant pairs have the same outcome in both members of the pair; discordant pairs have opposite outcomes.

An example presentation for matching 2x2 tables:
image-1664807990551.png
image-1664807969489.png
Notice that the row and column totals of the paired table equal the values of cells a, b, c, and d in the original table.

In the paired follow-up study table we would calculate the Risk Ratio (Relative Risk):
image-1664808244709.png
If we compute the Risk Ratio from each of the two tables above, we get the same value (1.5).
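
A small sketch of why the matched and unmatched risk ratios agree: the margins of the paired table are exactly the "diseased" cells of the unmatched table. The counts below are hypothetical (they are not the counts from the figures, so the result is 2 rather than 1.5), and the layout with the exposed member's outcome in rows is an assumption.

```r
# Hypothetical paired counts (rows: exposed member's outcome,
# columns: unexposed member's outcome); not the counts from the figures
paired <- matrix(c(15, 25,
                    5, 55),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(exposed   = c("D+", "D-"),
                                 unexposed = c("D+", "D-")))
n_pairs <- sum(paired)

# Margins of the paired table recover the unmatched 2x2 cells
exp_dis   <- sum(paired["D+", ])   # exposed subjects with disease
unexp_dis <- sum(paired[, "D+"])   # unexposed subjects with disease

rr_matched   <- exp_dis / unexp_dis
rr_unmatched <- (exp_dis / n_pairs) / (unexp_dis / n_pairs)
c(matched = rr_matched, unmatched = rr_unmatched)
```

Each pair contributes one exposed and one unexposed subject, so both groups have `n_pairs` members and the denominators cancel.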

The confidence interval for the RR in a matched follow-up study would be:
image-1664808888412.png
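
The interval can be sketched numerically, assuming the formula in the figure is the usual log-scale interval for matched-pair follow-up data, with SE(ln RR) = sqrt((b' + c') / ((a' + b')(a' + c'))). That formula and the cell names `n11`, `n10`, `n01`, `n00` are assumptions made here for illustration, and the counts are hypothetical.

```r
# Hypothetical paired counts:
# n11 = both diseased, n10 = only exposed member diseased,
# n01 = only unexposed member diseased, n00 = neither diseased
n11 <- 15; n10 <- 25; n01 <- 5; n00 <- 55

rr <- (n11 + n10) / (n11 + n01)

# Assumed standard error of log(RR) for matched pairs
se_log_rr <- sqrt((n10 + n01) / ((n11 + n10) * (n11 + n01)))

z  <- qnorm(0.975)                           # 95% confidence level
ci <- rr * exp(c(-1, 1) * z * se_log_rr)     # lower and upper limits
round(c(RR = rr, lower = ci[1], upper = ci[2]), 3)
```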

Matching in Case-Control Study

image-1664806210348.png
Matching on potential confounders removes confounding (C) between cases and controls in the study sample: for each case we select a control with the same values of the confounding variables.

For case control studies we set up our pairs differently:

image-1664808714918.png
This also results in a different estimate of the Odds Ratio:
image-1664808739147.png
Likewise, the confidence interval would be:
    OR x exp(+/- Z * sqrt(1/b' + 1/c'))

In case-control studies, the OR that accounts for matching will generally not equal the OR obtained by ignoring the matching. This is unlike the situation for the RR, where the point estimate is the same whether or not the matching is considered.
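
A minimal sketch of the matched OR, its confidence interval, and the OR that results from ignoring the matching. It assumes b' counts pairs where only the case is exposed and c' counts pairs where only the control is exposed; the counts and the object name `pairs_cc` are hypothetical.

```r
# Hypothetical case-control pairs
# (rows: case exposure, columns: control exposure)
pairs_cc <- matrix(c(20, 30,
                     10, 40),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(case    = c("E+", "E-"),
                                   control = c("E+", "E-")))

b_disc <- pairs_cc["E+", "E-"]   # case exposed, control unexposed (b')
c_disc <- pairs_cc["E-", "E+"]   # case unexposed, control exposed (c')

# Matched analysis: only discordant pairs contribute
or_matched <- b_disc / c_disc
z <- qnorm(0.975)
ci_matched <- or_matched * exp(c(-1, 1) * z * sqrt(1 / b_disc + 1 / c_disc))

# Unmatched analysis: collapse the pairs into an ordinary 2x2 table
case_exp   <- sum(pairs_cc["E+", ]); case_unexp <- sum(pairs_cc["E-", ])
ctrl_exp   <- sum(pairs_cc[, "E+"]); ctrl_unexp <- sum(pairs_cc[, "E-"])
or_unmatched <- (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)

c(matched = or_matched, unmatched = or_unmatched)
```

With these counts the two estimates differ, which illustrates the statement above for one made-up table.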

The McNemar Test

The McNemar Test is a non-parametric test for paired nominal data. Its test statistic follows a chi-square distribution under H0, and it can be used for retrospective case-control or follow-up studies. It assumes:

  • The two groups are mutually exclusive
  • A random sample

H0: The proportion of some disease is the same in participants with exposure and those without exposure (RR=1)
Ha: The proportion of some disease is not the same in participants with exposure and those without exposure (RR != 1)

image-1664808475923.png   with df = 1
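
As a sketch, the uncorrected McNemar statistic depends only on the discordant pairs, (b' - c')^2 / (b' + c'), and can be checked against `mcnemar.test()`. It is an assumption that the figure shows this uncorrected form; the counts are hypothetical.

```r
# Hypothetical paired table (rows: exposed member's outcome,
# columns: unexposed member's outcome)
tab <- matrix(c(15, 25,
                 5, 55), nrow = 2, byrow = TRUE)

b_disc <- tab[1, 2]   # discordant pairs of one type
c_disc <- tab[2, 1]   # discordant pairs of the other type

chi_sq <- (b_disc - c_disc)^2 / (b_disc + c_disc)    # uncorrected statistic
p_val  <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

# Built-in test without continuity correction for comparison
fit <- mcnemar.test(tab, correct = FALSE)
c(by_hand = chi_sq, built_in = unname(fit$statistic))
```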

Matched Analyses and Mantel-Haenszel

Mantel-Haenszel (MH) methods applied to strata defined by matched sets are equivalent to the conventional matched methods. This holds for both cohort (follow-up) and case-control studies. When each matched pair is a stratum, the MH method is carried out across the pairs.

So from the example above, if we calculated 50 strata for 50 matched pairs, we would end up with 50 tables:

image-1664810885032.png
We then apply the MH risk ratio (mRR) and the MH test across the 50 strata, which reproduces the McNemar result.

image-1664810988685.png

image-1664811005581.png
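
The equivalence can be checked numerically with a small sketch: expand a paired table into one row per subject, treat each pair as a stratum, and compare `mantelhaen.test()` with `mcnemar.test()`, both without continuity correction. The counts and the data frame `long` are hypothetical; its column names mirror the `pairs` data frame used in the Code section.

```r
# Hypothetical paired counts: n11 both diseased, n10 only the exposed member,
# n01 only the unexposed member, n00 neither
n11 <- 15; n10 <- 25; n01 <- 5; n00 <- 55

# Outcome (1 = diseased) of the exposed and unexposed member of each pair
exposed_d   <- rep(c(1, 1, 0, 0), times = c(n11, n10, n01, n00))
unexposed_d <- rep(c(1, 0, 1, 0), times = c(n11, n10, n01, n00))
n_pairs <- length(exposed_d)

# One row per subject; `match` identifies the pair (the stratum)
long <- data.frame(
  match    = rep(seq_len(n_pairs), times = 2),
  exposed  = rep(c(1, 0), each = n_pairs),
  diseased = c(exposed_d, unexposed_d)
)

mh <- mantelhaen.test(long$exposed, long$diseased, long$match, correct = FALSE)
mc <- mcnemar.test(table(exposed_d, unexposed_d), correct = FALSE)

c(MH = unname(mh$statistic), McNemar = unname(mc$statistic))
```

Note that `mantelhaen.test()` reports a common odds ratio, which here equals the ratio of discordant pairs `n10 / n01`; it does not report the mRR.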

Testing Interaction

It is possible to test for interaction between a matching variable and exposure (whether a matching variable modifies the effect of E on D).

We obtain an OR for each subgroup of the interaction variable and conduct a formal test to see whether the two ORs are equal.
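
The notes do not say which formal test is meant. One possible approach, shown here purely as an assumption, is a Wald-type z test on the difference of the log matched ORs, using the discordant-pair counts in each subgroup. All counts are hypothetical.

```r
# Hypothetical discordant-pair counts within two subgroups of the matching variable
b1 <- 30; c1 <- 10   # subgroup 1
b2 <- 18; c2 <- 12   # subgroup 2

or1 <- b1 / c1       # matched OR in subgroup 1
or2 <- b2 / c2       # matched OR in subgroup 2

# Assumed Wald test: the subgroups are independent, so the variances
# of the two log ORs add
se_diff <- sqrt(1 / b1 + 1 / c1 + 1 / b2 + 1 / c2)
z_stat  <- (log(or1) - log(or2)) / se_diff
p_val   <- 2 * pnorm(-abs(z_stat))

c(OR1 = or1, OR2 = or2, z = z_stat, p = p_val)
```

A small p-value would suggest the two ORs differ, that is, the matching variable modifies the effect of exposure.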

R-to-1 Matching

Match one index subject to R referent subjects. We create a stratum for every matched set, so each stratum contains a matched set of R + 1 subjects.

image-1664813016748.png

In a case control study:

  • Index = case
  • Referent = control
  • Stratum:

    image-1664812884849.png

  • Exposure status:

    image-1664812922596.png

  • We can then use the following formulas for the estimate of the OR and the chi-squared statistic:

    image-1664813185420.png

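McNemar's test cannot be used with R-to-1 matching when R > 1, because the matched sets are no longer pairs; the Mantel-Haenszel method is used instead.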
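
A sketch of the stratified analysis for 2-to-1 matching, assuming the OR formula in the figure is the usual Mantel-Haenszel estimator, sum(a_i d_i / n_i) / sum(b_i c_i / n_i), with one stratum per matched set. The set counts are hypothetical; the column names `smoke`, `casecon`, and `casenum` mirror the `Rto1` data frame used in the Code section.

```r
R <- 2   # controls per case (hypothetical)

# Hypothetical numbers of matched sets by exposure pattern
patterns <- data.frame(
  case_exposed     = c(1, 1, 1, 0, 0, 0),
  controls_exposed = c(0, 1, 2, 0, 1, 2),
  n_sets           = c(8, 6, 2, 10, 3, 1)
)
sets   <- patterns[rep(seq_len(nrow(patterns)), patterns$n_sets), ]
n_sets <- nrow(sets)

# MH odds ratio by hand; every stratum has n = R + 1 subjects
a_i <- sets$case_exposed          # exposed cases
b_i <- 1 - a_i                    # unexposed cases
c_i <- sets$controls_exposed      # exposed controls
d_i <- R - c_i                    # unexposed controls
or_mh <- sum(a_i * d_i / (R + 1)) / sum(b_i * c_i / (R + 1))

# One row per subject: the case first, then its R controls
long <- data.frame(
  casenum = rep(seq_len(n_sets), each = R + 1),
  casecon = rep(c(1, rep(0, R)), times = n_sets),
  smoke   = unlist(Map(function(a, k) c(a, rep(c(1, 0), times = c(k, R - k))),
                       sets$case_exposed, sets$controls_exposed))
)

mh <- mantelhaen.test(long$smoke, long$casecon, long$casenum, correct = FALSE)
c(by_hand = or_mh, built_in = unname(mh$estimate))
```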

When to Match?

  • How hard is it to obtain subjects?
  • How hard is it to match?
  • How much does matching buy?
    • How influential are the confounders? How strongly correlated are the matched pairs?
    • Is variation within the pairs small relative to variation between pairs?
  • What do we gain?
    • Eliminate/reduce confounding
    • Reduce variability and increase power

Issues with Matching

  • Matched studies use restriction sampling
  • The exposed and the cases will not typically represent the general population
    • Ex. in a case-control study of smoking and lung cancer, the cases will on average be more male and older than the general population, and may not reflect its racial distribution
  • We may not be able to generalize results

Code

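### Matched pairs can be analyzed using the Mantel-Haenszel method.
### Each matched pair is treated as a separate stratum.
### Assumes a data frame `pairs` with one row per subject and the columns
### exposed, diseased, and match (the pair identifier).
mantelhaen.test(pairs$exposed, pairs$diseased, pairs$match, correct = FALSE)

### Table of pairs. exposed.d and unexposed.d are assumed to hold the
### outcome of the exposed and the unexposed member of each pair.
table(exposed.d, unexposed.d)

### Analyze data with mcnemar.test()
mcnemar.test(table(exposed.d, unexposed.d), correct = FALSE)  # no continuity correction
mcnemar.test(table(exposed.d, unexposed.d))                   # continuity correction (default)

### Matched sets (R-to-1) can be analyzed using the Mantel-Haenszel method.
### Each matched set is treated as a separate stratum.
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum, correct = FALSE)  # no continuity correction
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum)                   # continuity correction (default)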