Multiple Comparisons and Evaluating Significance

  • In 1978, Restriction Fragment Length Polymorphisms (RFLPs) were used for linkage analysis.
  • In 1987, the first human genetic map was created.
  • In 1989, microsatellite markers made genome-wide linkage studies possible.
  • 1990-2003, the Human Genome Project sequenced the human genome.
  • 2002-2006 HapMap project collected sequences in populations to discover variation across the genome.
  • 2006 onward, Genome-Wide Association Studies (GWAS)
  • 2010 onward, large scale custom arrays
  • 2010 onward, sequencing technology becomes affordable
  • Even more WGS projects...
    • ADSP 2012
    • TOPMed 2014
    • CCDG 2014

Prior to the GWAS era, genetic association studies were hypothesis driven: markers within or near a gene or region were tested for association. Hypothesis: "Trait X is caused/influenced by Gene A." The hypothesis (gene or genes) came from:

  • Experiments in other species
  • Known associations with a related trait in humans
  • Linkage analysis localizing trait to a specific chromosomal region

Chip-based Genome-wide Association Scans

  • Hypothesis generating
    • Assumes only that there are genetic effects large enough to find
    • Asks what genes/variants are associated with my trait
  • 500k to 5 million SNPs/variants across the genome
    • Multiple genome-wide chips available
    • Varying strategies for SNP selection
    • Imputation allows testing of ungenotyped SNPs
    • Typically GWAS chips have focused on common SNPs with frequency > 1%
Candidate
  • Limits testing to locations of perceived high-prior-probability
  • "If you look under the lamppost you can only see what it illuminates"
Genome-Wide
  • Extreme multiple testing - requires large sample size, meta-analysis of multiple studies to overcome
  • Gives an "unbiased" view of the genome
  • Allows unexpected discoveries

Whole Genome or Exome Sequencing

  • Identifies known SNPs (that would be on a chip) but also previously undiscovered variants.
  • Attempts to assay all, or nearly all, variation in genome or exome
    • Whole exome:
      • ~1% of the genome
      • ~30 million bp
      • Number of variants observed depends on sample size and population
    • Whole genome: ~3 billion bp; > 30 million known variants in the 1000 Genomes (1000G) project

Statistical Significance

There are many things to test in genetic association studies:

  • Multiple phenotypes
  • Multiple SNPs
    • Candidate gene or region association
    • Genome-wide association
    • Haplotype Analyses
  • Gene-Gene or Gene-environmental Interactions

The multiple tests are often correlated.

Type I error: Null hypothesis of "no association" is rejected, when in fact the marker is NOT associated with that trait.
This implies researchers will spend a considerable amount of resources focusing on a gene or chromosomal region that is not truly important for the trait.

Type II error: Null hypothesis of "no association" is NOT rejected, when in fact the trait and marker are associated.
This implies the chromosomal region/gene is discarded; a piece of the genetic puzzle remains missing for now.

  • The significance level alpha for a single statistical test is the type-I error rate for that test.
  • If we perform multiple tests within the same study at level alpha, the type-I error rate specified will apply to each specific test but not to the entire experiment (unless some adjustment is made).
  • Probability of a type II error is beta.
  • Power = 1 - Beta
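To see why a per-test alpha does not protect the whole experiment, here is a minimal sketch of how the chance of at least one type I error grows with the number of tests. It assumes the tests are independent, which is a simplification; the function name `fwer_independent` is made up for this illustration.

```python
def fwer_independent(alpha: float, m: int) -> float:
    """Probability of at least one type I error across m independent
    tests, each carried out at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m


# Experiment-wide error rate when every test uses alpha = 0.05
for m in (1, 10, 100, 1000):
    print(f"m = {m:>4}: P(at least one type I error) = {fwer_independent(0.05, m):.3f}")
```

With alpha = 0.05 the experiment-wide error rate is already about 0.40 at ten independent tests.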

For a multiple testing problem with m tests:
image-1666377062984.png

Family-wise error rate (FWER) is the probability of at least one type I error; FWER = P(V > 0), where V is the number of type I errors (false rejections) among the m tests.

False discovery rate (FDR) is the expected proportion of type I errors among the rejected hypotheses; FDR = E(V/R), where R is the total number of rejected hypotheses.
    Assume V/R = 0 when R = 0
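The two error rates can be compared with a small simulation. This is only a sketch under stated assumptions: test statistics are independent standard normals under the null, shifted by a hypothetical `effect` under the alternative, and every test is judged at an unadjusted alpha. All names and parameter values are illustrative.

```python
import random
from statistics import NormalDist


def simulate_error_rates(m=20, m_alt=5, effect=3.0, alpha=0.05,
                         n_sims=2000, seed=1):
    """Estimate FWER = P(V > 0) and FDR = E(V/R) when each of m tests
    is rejected at unadjusted level alpha. The first m_alt tests are
    true associations; the remaining tests are null."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    sims_with_false_rejection = 0
    sum_false_discovery_proportion = 0.0
    for _ in range(n_sims):
        v = 0  # false rejections (type I errors)
        r = 0  # all rejections
        for j in range(m):
            is_null = j >= m_alt
            z = rng.gauss(0.0 if is_null else effect, 1.0)
            p = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # two-sided p-value
            if p < alpha:
                r += 1
                if is_null:
                    v += 1
        if v > 0:
            sims_with_false_rejection += 1
        # Convention from the definition: V/R = 0 when R = 0
        sum_false_discovery_proportion += v / r if r > 0 else 0.0
    return (sims_with_false_rejection / n_sims,
            sum_false_discovery_proportion / n_sims)


fwer, fdr = simulate_error_rates()
print(f"estimated FWER = {fwer:.3f}, estimated FDR = {fdr:.3f}")
```

Because V/R can never exceed the indicator of V > 0, the estimated FDR is never larger than the estimated FWER.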

Procedures to Control FWER

The general strategy is to adjust the p-value of each test for multiple testing, then compare the adjusted p-values to alpha, so that FWER can be controlled at alpha.

Equivalently, determine the nominal p-value that is required to achieve FWER alpha.

Sidák

Sidák adjusted p-value is based on the binomial distribution:

  • Each test is a trial. Under the null hypothesis, the probability of a "success" (a false rejection) is p, the significance level that is used
  • The probability of at least one success in m trials, each with probability p:

    image-1666377406424.png

  • For a test with p-value p_i, to adjust for m total tests, the adjusted p-value is p_i* = 1 - (1 - p_i)^m
  • This is conservative (over-corrects) when the tests are not independent
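The Sidák adjustment can be sketched in a few lines. The function name `sidak_adjust` is made up for this example. The formula 1 - (1 - p)^m is evaluated through `math.log1p` and `math.expm1` because very small p-values, which are common in genetic studies, would otherwise round to zero.

```python
import math


def sidak_adjust(p_values):
    """Sidak-adjusted p-values: 1 - (1 - p)^m for m = len(p_values)."""
    m = len(p_values)
    adjusted = []
    for p in p_values:
        if p >= 1.0:
            adjusted.append(1.0)  # log1p(-1) is undefined, so handle p = 1 directly
        else:
            # 1 - (1 - p)^m == -expm1(m * log1p(-p)), stable for tiny p
            adjusted.append(-math.expm1(m * math.log1p(-p)))
    return adjusted


raw = [0.001, 0.01, 0.04, 0.5]
for p, p_adj in zip(raw, sidak_adjust(raw)):
    print(f"p = {p:<6} adjusted = {p_adj:.4f}")
```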

Bonferroni

A simplification of Sidák:
image-1666377583820.png

Bonferroni adjusted p-value: 

  • p_i* = m × p_i (capped at 1)
  • Over-corrects (conservative) if the tests are correlated
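A minimal sketch of the Bonferroni adjustment follows; the function name `bonferroni_adjust` is made up for this example. The cap at 1 keeps the result a valid probability.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: m * p, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


raw = [0.001, 0.01, 0.04, 0.5]
for p, p_adj in zip(raw, bonferroni_adjust(raw)):
    print(f"p = {p:<6} adjusted = {p_adj:.4f}")
```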

Below are the per-test p-value thresholds needed to reject at a family-wise significance level of 0.05:
image-1666378611019.png
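Thresholds of this kind can be computed directly from the two formulas. The sketch below assumes a family-wise level of 0.05; the values of m are illustrative and the function names are made up for this example.

```python
import math


def bonferroni_threshold(alpha, m):
    """Per-test p-value threshold under Bonferroni: alpha / m."""
    return alpha / m


def sidak_threshold(alpha, m):
    """Per-test threshold under Sidak: 1 - (1 - alpha)^(1/m),
    evaluated via log1p/expm1 for accuracy when m is large."""
    return -math.expm1(math.log1p(-alpha) / m)


alpha = 0.05
print(f"{'m':>9}  {'Bonferroni':>12}  {'Sidak':>12}")
for m in (1, 10, 100, 10_000, 1_000_000):
    print(f"{m:>9}  {bonferroni_threshold(alpha, m):>12.3e}  "
          f"{sidak_threshold(alpha, m):>12.3e}")
```

For one million tests the Bonferroni threshold is 5e-8, and the Sidák threshold is only slightly larger.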

minP

The minP adjusted p-value is the probability that the minimum p-value from m tests is smaller than the observed p-value when ALL of the tests are NULL.
image-1666378838710.png

Equivalent to the Sidák adjustment if all tests are independent. But for dependent tests, we don't know the distribution of the p-values under the null hypothesis, so we use permutation to determine the distribution.

Adjusted p-value is the probability that the minimum p-value in a resampled data set is smaller than the observed p-value.

This is less conservative than the above two methods when tests are correlated, and the results are equal to Sidák when tests are independent.

Permutation is done under the assumption that the phenotype is independent of the genotypes, so phenotypes are permuted with respect to genotypes.
Original:
image-1666379458667.png
Permuted:
image-1666379478602.png
Genotypes from an individual are kept together to preserve linkage disequilibrium (LD).

Permutation Procedure
  • Create 1000+ permuted data sets
    • Identical to the original except phenotype values have been assigned randomly
  • Analyze each in exactly the same manner as the original data set
  • Determine the minimum p-value from each permuted data set
    • 1000+ minimum p-values
  • The minP adjustment: the adjusted p-value is the proportion of minimum p-values that are smaller than the observed p-value.
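The procedure above can be sketched as follows. This is a toy illustration, not a production pipeline: the per-SNP test is a genotype-phenotype correlation with a normal approximation (an assumption made here to keep the example dependency-free), the data are simulated, and the names `trend_p_value` and `min_p_adjust` are made up. Only the phenotype vector is shuffled, so each individual's genotypes stay together.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def trend_p_value(genotypes, phenotypes):
    """Two-sided p-value for the genotype-phenotype correlation,
    using z = r * sqrt(n) with a normal approximation."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0.0 or syy == 0.0:
        return 1.0  # no variation, so no evidence of association
    z = sxy / math.sqrt(sxx * syy) * math.sqrt(n)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def min_p_adjust(snps, phenotypes, n_perm=1000, seed=0):
    """snps: list of SNPs, each a list of genotypes (one per individual).
    Returns one minP-adjusted p-value per SNP."""
    rng = random.Random(seed)
    observed = [trend_p_value(snp, phenotypes) for snp in snps]
    shuffled = list(phenotypes)
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute phenotypes; genotype rows stay intact
        min_ps.append(min(trend_p_value(snp, shuffled) for snp in snps))
    # Proportion of permutation minima at or below each observed p-value
    # (ties are counted, a slightly conservative reading of "smaller than")
    return [sum(mp <= p for mp in min_ps) / n_perm for p in observed]


# Toy data: 80 individuals, 5 SNPs, phenotype driven by the first SNP
rng = random.Random(42)
toy_snps = [[rng.choice([0, 1, 2]) for _ in range(80)] for _ in range(5)]
toy_pheno = [2.0 * g + rng.gauss(0.0, 1.0) for g in toy_snps[0]]
print(min_p_adjust(toy_snps, toy_pheno, n_perm=200, seed=1))
```

The smallest adjusted p-value this sketch can distinguish from zero is 1/n_perm, which is one reason genome-wide analyses need far more permutations than a toy example.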

Permutation is computationally expensive, and in some situations it is not possible at all (related individuals, meta-analysis results).

Alternative

Use the Bonferroni or Sidak correction with the "effective number of independent tests" instead of total number of tests. This reduces the number of tests to account for dependence among test statistics. We must approximate the equivalent number of independent tests.
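The text does not say how the effective number of tests is approximated. One commonly cited option is Nyholt's eigenvalue-based estimate, M_eff = 1 + (M - 1)(1 - Var(λ)/M), where λ are the eigenvalues of the M × M correlation matrix of the markers. The sketch below uses that formula as an assumption, not as the method intended here. Because the eigenvalues of a correlation matrix sum to M and their squares sum to the sum of squared matrix entries, their variance can be computed without an eigen-decomposition. The function name and the example matrices are made up.

```python
def effective_number_of_tests(corr):
    """Nyholt-style estimate of the effective number of independent tests.

    corr: M x M correlation matrix of the markers (list of lists).
    """
    m = len(corr)
    if m == 1:
        return 1.0
    # Sum of squared entries equals the sum of squared eigenvalues
    sum_sq = sum(c * c for row in corr for c in row)
    # Sample variance of the eigenvalues; their mean is exactly 1
    var_eig = (sum_sq - m) / (m - 1)
    return 1.0 + (m - 1) * (1.0 - var_eig / m)


independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
perfect_ld = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
moderate_ld = [[1.0, 0.5], [0.5, 1.0]]

for name, corr in [("independent", independent),
                   ("perfect LD", perfect_ld),
                   ("moderate LD", moderate_ld)]:
    m_eff = effective_number_of_tests(corr)
    print(f"{name}: M_eff = {m_eff:.2f}, Bonferroni threshold = {0.05 / m_eff:.4f}")
```

Uncorrelated markers give M_eff = M, and perfectly correlated markers give M_eff = 1, so the corrected threshold moves between the full Bonferroni threshold and the unadjusted alpha.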