# Matching

The aim of matching is to remove confounding by making the groups being compared similar on a potential confounder. Doing so eliminates (or reduces) confounding and also reduces variability, thereby increasing power.

Recall the paired t-test for two dependent (paired) samples:  
[![image-1664806890708.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664806890708.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664806890708.png)  
with n-1 degrees of freedom and standard error:  
[![image-1664806918982.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664806918982.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664806918982.png)  
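The test statistic is inversely related to the variance of the paired differences, so the reduction in variability that matching provides makes the statistic larger and increases power.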
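
As a minimal sketch, assuming the statistic above is the usual paired t statistic (mean within-pair difference divided by its standard error), the calculation can be done by hand and compared with `t.test(..., paired = TRUE)`. The vectors `x1` and `x2` are made-up illustrative values, not course data.

```R
# Hypothetical paired measurements (illustrative values only)
x1 <- c(140, 132, 128, 150, 145, 138, 129, 136)
x2 <- c(135, 130, 129, 142, 140, 131, 127, 133)

d  <- x1 - x2                  # within-pair differences
n  <- length(d)
se <- sd(d) / sqrt(n)          # standard error of the mean difference
t_stat <- mean(d) / se         # paired t statistic, n - 1 degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)

# Built-in equivalent
fit <- t.test(x1, x2, paired = TRUE)
c(by_hand = t_stat, t.test = unname(fit$statistic))
```

A smaller `sd(d)` gives a smaller standard error and therefore a larger statistic for the same mean difference.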

### Types of Matching

- Matched Pairs (covered today)
- Categorical Matching (unmatched analysis, stratified or regression) 
    - Stratify cases, then find equal number of controls for each category (or equal multiple).
- Caliper Matching 
    - Only for continuous variables
    - Similar to categorical but not the same
- Nearest Neighbor 
    - Select 'closest' control as match
    - May have minimum match criteria

#### Matching in Follow-Up Study

[![image-1664805506582.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664805506582.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664805506582.png)  
Remove Confounding (C) in the study sample between Exposed (E) and unexposed by matching on the potential confounders.

There are 4 possible combinations of outcomes in exposed and unexposed groups:  
[![image-1664807476717.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664807476717.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664807476717.png)  
In concordant pairs both members have the same outcome; in discordant pairs the outcomes differ.

An example presentation for matching 2x2 tables:  
[![image-1664807990551.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664807990551.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664807990551.png)  
[![image-1664807969489.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664807969489.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664807969489.png)  
Notice that the row and column totals of the paired table now equal the values of cells a, b, c, and d in the original table.

In the paired follow-up study table we would calculate the risk ratio (RR):  
[![image-1664808244709.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664808244709.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664808244709.png)  
Computing the risk ratio from either of the tables above gives the same value (1.5).

The confidence interval for the RR in a matched follow-up study would be:  
[![image-1664808888412.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664808888412.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664808888412.png)
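
A rough sketch of these calculations in R follows. The pair counts are made up (chosen only so that the RR comes out to 1.5), the variable names are hypothetical, and the interval assumes the usual large-sample variance of ln(RR) for matched pairs, (discordant pairs) / (exposed diseased × unexposed diseased), which may be written differently in the figure above.

```R
# Hypothetical pair counts (illustrative values only)
both_d     <- 5    # both members diseased               (concordant)
exp_only   <- 10   # only the exposed member diseased    (discordant)
unexp_only <- 5    # only the unexposed member diseased  (discordant)
neither    <- 30   # neither member diseased             (concordant)
n_pairs <- both_d + exp_only + unexp_only + neither

# Risk in each group: diseased subjects / number of pairs
risk_exposed   <- (both_d + exp_only)   / n_pairs
risk_unexposed <- (both_d + unexp_only) / n_pairs
rr <- risk_exposed / risk_unexposed      # n_pairs cancels

# Large-sample 95% CI on the log scale (assumed variance formula)
se_log_rr <- sqrt((exp_only + unexp_only) /
                  ((both_d + exp_only) * (both_d + unexp_only)))
z  <- qnorm(0.975)
ci <- rr * exp(c(-1, 1) * z * se_log_rr)
round(c(RR = rr, lower = ci[1], upper = ci[2]), 2)
```

The point estimate uses only the marginal totals, which is why it matches the unmatched RR; the matching enters through the standard error, which depends on the discordant pairs.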

#### Matching in Case-Control Study

[![image-1664806210348.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664806210348.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664806210348.png)  
Remove confounding (C) in the study sample between cases and controls by matching on potential confounders: for each case we select a control with the same values of the confounding variables.

For case control studies we set up our pairs differently:

[![image-1664808714918.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664808714918.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664808714918.png)  
This also results in a different estimate of the odds ratio:  
[![image-1664808739147.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664808739147.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664808739147.png)  
Likewise, the confidence interval would be  
 OR × exp(± Z × sqrt(1/b' + 1/c')), where b' and c' are the counts of the two types of discordant pairs.

**ORs in case-control studies will not be the same if the matching is ignored.** This is unlike the situation for the RR, where the point estimate is the same whether or not the matching is taken into account.
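
The following sketch illustrates the point with made-up pair counts. It assumes the matched OR is the ratio of the two discordant pair counts (b'/c'), consistent with the confidence interval above; `pair_tab`, `b_disc`, and `c_disc` are hypothetical names.

```R
# Hypothetical pair counts (rows: case exposure, columns: control exposure)
pair_tab <- matrix(c(20, 30,
                     10, 40), nrow = 2, byrow = TRUE,
                   dimnames = list(case    = c("E", "not E"),
                                   control = c("E", "not E")))

b_disc <- pair_tab["E", "not E"]   # case exposed, control unexposed
c_disc <- pair_tab["not E", "E"]   # case unexposed, control exposed

# Matched analysis: only discordant pairs contribute
matched_or <- b_disc / c_disc
ci <- matched_or * exp(c(-1, 1) * qnorm(0.975) * sqrt(1 / b_disc + 1 / c_disc))

# Ignoring the matching: collapse the pairs into an ordinary 2x2 table of subjects
cases_exp      <- sum(pair_tab["E", ])
cases_unexp    <- sum(pair_tab["not E", ])
controls_exp   <- sum(pair_tab[, "E"])
controls_unexp <- sum(pair_tab[, "not E"])
unmatched_or <- (cases_exp * controls_unexp) / (cases_unexp * controls_exp)

round(c(matched = matched_or, lower = ci[1], upper = ci[2],
        unmatched = unmatched_or), 2)
```

With these counts the unmatched OR is closer to 1 than the matched OR.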

#### The McNemar Test

The McNemar test is a non-parametric test for paired nominal data. Under H0 its test statistic follows a chi-square distribution with 1 degree of freedom, and it can be used for retrospective case-control or follow-up studies. It assumes:

- The two groups are mutually exclusive
- A random sample

H0: The proportion of some disease is the same in participants with exposure and those without exposure (RR=1)  
Ha: The proportion of some disease is not the same in participants with exposure and those without exposure (RR != 1)

[![image-1664808475923.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664808475923.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664808475923.png) with df = 1
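
As a sketch, assuming the statistic above is the uncorrected form (difference of the discordant counts, squared, divided by their sum), it can be computed by hand and checked against `mcnemar.test()`. The pair counts are made up for illustration.

```R
# Hypothetical table of pairs (illustrative values only)
pair_tab <- matrix(c(5, 10,
                     5, 30), nrow = 2, byrow = TRUE,
                   dimnames = list(exposed   = c("D", "not D"),
                                   unexposed = c("D", "not D")))

b_disc <- pair_tab["D", "not D"]   # only the exposed member diseased
c_disc <- pair_tab["not D", "D"]   # only the unexposed member diseased

# Uncorrected McNemar statistic: uses the discordant pairs only
chi_sq <- (b_disc - c_disc)^2 / (b_disc + c_disc)
p_val  <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

# Built-in equivalent (correct = FALSE turns off the continuity correction)
fit <- mcnemar.test(pair_tab, correct = FALSE)
c(by_hand = chi_sq, mcnemar.test = unname(fit$statistic))
```

The concordant cells do not enter the statistic at all, so changing them leaves the result unchanged.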

### Matched Analyses and Mantel-Haenszel

Mantel-Haenszel (MH) methods applied to strata defined by the matched sets are equivalent to the conventional matched methods. This holds for both cohort (follow-up) and case-control studies. When each matched pair is its own stratum, we apply the MH method across the pairs.

So in the example above, treating each of the 50 matched pairs as a stratum gives 50 tables:

[![image-1664810885032.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664810885032.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664810885032.png)  
We then apply the mRR and MH formulas across all 50 strata, which reproduces the McNemar result.

[![image-1664810988685.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664810988685.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664810988685.png)

[![image-1664811005581.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664811005581.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664811005581.png)
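
The equivalence can be checked numerically. The sketch below uses made-up pair counts, builds a long-format data frame with one stratum per pair (the name `pairs_long` and its columns are hypothetical), and compares the uncorrected MH statistic with the uncorrected McNemar statistic.

```R
# Hypothetical pair counts (illustrative values only)
counts <- c(both = 5, exposed_only = 10, unexposed_only = 5, neither = 30)

# Outcome (1 = diseased) of the exposed and unexposed member of every pair
exposed.d   <- rep(c(1, 1, 0, 0), times = counts)
unexposed.d <- rep(c(1, 0, 1, 0), times = counts)
n_pairs <- length(exposed.d)

# Long format: two rows per pair, one stratum (match id) per pair
pairs_long <- data.frame(
  match    = rep(seq_len(n_pairs), times = 2),
  exposed  = rep(c(1, 0), each = n_pairs),
  diseased = c(exposed.d, unexposed.d)
)

mh  <- mantelhaen.test(pairs_long$exposed, pairs_long$diseased,
                       pairs_long$match, correct = FALSE)
mcn <- mcnemar.test(table(exposed.d, unexposed.d), correct = FALSE)
c(MH = unname(mh$statistic), McNemar = unname(mcn$statistic))
```

Note that `mantelhaen.test()` reports a common odds ratio as its estimate, not a risk ratio; with one pair per stratum that estimate is the ratio of the discordant pair counts.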

### Testing Interaction

It is possible to test for interaction between a matching variable and exposure (that is, whether a matching variable modifies the effect of E on D).

We obtain an OR for each subgroup of the interaction variable and conduct a formal test to see whether the two ORs are equal.
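
The notes do not say which formal test is used. One possible sketch, assuming two independent subgroups and the large-sample variance 1/b' + 1/c' for the log of each matched OR, is a Wald test on the difference of the log ORs. All counts and names below are hypothetical.

```R
# Hypothetical discordant-pair counts within two levels of a matching variable
b1 <- 30; c1 <- 10   # subgroup 1: case-only exposed, control-only exposed
b2 <- 18; c2 <- 15   # subgroup 2

or1 <- b1 / c1       # matched OR in subgroup 1
or2 <- b2 / c2       # matched OR in subgroup 2

# Wald test of H0: OR1 = OR2, carried out on the log scale
diff_log <- log(or1) - log(or2)
se_diff  <- sqrt(1 / b1 + 1 / c1 + 1 / b2 + 1 / c2)
z_stat   <- diff_log / se_diff
p_val    <- 2 * pnorm(-abs(z_stat))

round(c(OR1 = or1, OR2 = or2, z = z_stat, p = p_val), 3)
```

A small p-value would suggest that the matching variable modifies the effect of exposure; other homogeneity tests could be used instead.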

### R-to-1 Matching

Match one index subject to R referent subjects. We create a stratum for every matched set, so each stratum contains R + 1 subjects.

[![image-1664813016748.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664813016748.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664813016748.png)

In a case control study:

- Index = case
- Referent = control
- Stratum:  
    [![image-1664812884849.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664812884849.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664812884849.png)
- Exposure status:  
    [![image-1664812922596.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664812922596.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664812922596.png)
- We can then use the following formulas for the estimate of the OR and the chi-squared statistic:  
    [![image-1664813185420.png](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/scaled-1680-/image-1664813185420.png)](https://bookstack.mitchellhenschel.com/uploads/images/gallery/2022-10/image-1664813185420.png)

McNemar's test cannot be used with R-to-1 matching when R > 1, since it requires one-to-one pairs.
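
A stratified analysis of matched sets can be sketched as follows, assuming the OR formula above is the Mantel-Haenszel estimate with one stratum per matched set. The data are made up (8 sets with R = 2); the column names `casenum`, `casecon`, and `smoke` follow the code at the end of these notes, and their coding here (1 = case, 1 = smoker) is an assumption.

```R
# Hypothetical 2-to-1 matched case-control data: 8 matched sets,
# each with 1 case and R = 2 controls (values are illustrative only)
case_smoke   <- c(1, 1, 1, 1, 1, 0, 0, 0)  # exposure of the case in each set
n_ctrl_smoke <- c(0, 0, 0, 1, 1, 1, 1, 0)  # exposed controls (0, 1 or 2) per set

Rto1 <- do.call(rbind, lapply(seq_along(case_smoke), function(i) {
  data.frame(
    casenum = i,                       # matched-set identifier (stratum)
    casecon = c(1, 0, 0),              # 1 = case, 0 = control
    smoke   = c(case_smoke[i],
                as.numeric(n_ctrl_smoke[i] >= 1),
                as.numeric(n_ctrl_smoke[i] >= 2))
  )
}))

# One 2x2 table (smoke x casecon) per matched set
tab <- table(smoke = Rto1$smoke, casecon = Rto1$casecon, set = Rto1$casenum)
n_i <- apply(tab, 3, sum)              # R + 1 = 3 subjects per stratum

# Mantel-Haenszel odds ratio by hand
or_mh <- sum(tab["1", "1", ] * tab["0", "0", ] / n_i) /
         sum(tab["0", "1", ] * tab["1", "0", ] / n_i)

# Built-in equivalent
fit <- mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum, correct = FALSE)
c(by_hand = or_mh, mantelhaen.test = unname(fit$estimate))
```

Sets in which the case and all of its controls share the same exposure status contribute nothing to either sum.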

### When to Match?

- How hard is it to obtain subjects?
- How hard is it to match?
- How much does matching buy? 
    - How influential are the confounders? How strongly correlated are the matched pairs?
    - Is variation within the pairs small relative to variation between pairs?
- What do we gain? 
    - Eliminate/reduce confounding
    - Reduce variability and increase power

#### Issues with Matching

- Matched studies use restricted sampling
- The exposed and the cases will not typically represent the general population 
    - Ex. in a case-control study of smoking and lung cancer, the cases will on average be more male and older than the general population, and may not reflect its racial distribution
- We may not be able to generalize results

## Code

```R
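### Matched pairs can be analyzed using the Mantel-Haenszel method.
### Each matched pair is treated as a separate stratum.
### `pairs` is assumed to be in long format: one row per subject, with columns
### exposed, diseased, and match (the pair identifier).
mantelhaen.test(pairs$exposed, pairs$diseased, pairs$match, correct=FALSE)

### Table of pairs: exposed.d and unexposed.d are assumed to hold the outcome
### of the exposed and unexposed member of each pair (one element per pair).
pair.table <- table(exposed.d, unexposed.d)
pair.table

### Analyze data with mcnemar.test()
mcnemar.test(pair.table, correct=FALSE)  # without continuity correction
mcnemar.test(pair.table)                 # with continuity correction (default)

### Matched sets can be analyzed using the Mantel-Haenszel method.
### Each matched set is treated as a separate stratum (casenum is assumed to identify the set).
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum, correct=FALSE)
mantelhaen.test(Rto1$smoke, Rto1$casecon, Rto1$casenum)  # with continuity correction (default)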
```