What is allele harmonization?

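Allele harmonization ensures that a SNP’s effect on the exposure and its effect on the outcome correspond to the same allele. This is difficult for palindromic SNPs, whose allele pair looks the same on both DNA strands, so a strand mismatch between the two datasets cannot be detected from the allele letters alone.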
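To make the palindromic problem concrete, here is a minimal Python sketch. The helper names `flip_strand` and `is_palindromic` are hypothetical and chosen for illustration only; they are not taken from any particular package.

```python
# Watson-Crick complements: reading a SNP on the opposite strand swaps each base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def flip_strand(alleles):
    """Return the (effect, other) allele pair as it reads on the opposite strand."""
    return tuple(COMPLEMENT[a] for a in alleles)


def is_palindromic(alleles):
    """A SNP is palindromic when its two alleles are complements (A/T or C/G)."""
    effect, other = alleles
    return COMPLEMENT[effect] == other


# Non-palindromic: the flipped pair uses different letters,
# so a strand mismatch between datasets is visible.
print(flip_strand(("A", "G")))  # ('T', 'C')

# Palindromic: flipping the strand yields the same two letters with the roles
# swapped, which is indistinguishable from a genuine swap of effect and other allele.
print(flip_strand(("A", "T")))  # ('T', 'A')
```

For an A/G SNP, a dataset reporting T/C is clearly on the other strand. For an A/T SNP, a dataset reporting T/A could be on the other strand or could simply have chosen the other allele as the effect allele, and the letters cannot tell these cases apart.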

There are three options to harmonize the data:

  1. Assume all alleles are presented on the forward strand: This option presumes that, in both datasets, the alleles for every SNP are reported as they appear on the ‘forward’ DNA strand (also known as the ‘plus’ strand). No strand correction is attempted, so it is appropriate only when you are confident that both datasets follow this convention.
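  2. Infer the forward strand alleles using allele frequency information: For palindromic SNPs, the effect allele frequencies reported in the two datasets are compared to infer whether the alleles refer to the same strand, and any strand inconsistencies are corrected. The inference becomes unreliable when the allele frequency is close to 0.5, because both alleles are then about equally common.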
  3. Correct the strand for non-palindromic SNPs, but drop all palindromic SNPs: Palindromic SNPs are those whose pair of possible alleles is the same when read in the 5’ to 3’ direction on either DNA strand (for example, A/T or C/G). Because of this, the strand orientation cannot be determined from the alleles alone. In this option, strand correction is performed for non-palindromic SNPs (those whose allele pair differs between the two strands), while all palindromic SNPs are excluded from the analysis to avoid potential errors.
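The three options can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used by any specific tool: the function name `harmonise_beta`, the dictionary keys (`ea`, `oa`, `eaf`, `beta`), and the `maf_cutoff` value of 0.42 are all hypothetical choices made for the example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def harmonise_beta(exp, out, action=2, maf_cutoff=0.42):
    """Return the outcome beta aligned to the exposure effect allele,
    or None when the SNP is dropped.

    exp, out: dicts with 'ea' (effect allele), 'oa' (other allele) and
    'eaf' (effect allele frequency); out also needs 'beta'.
    action: 1, 2 or 3, matching the three options listed above.
    maf_cutoff: assumed cutoff above which a palindromic SNP is treated
    as too close to 0.5 for frequency-based inference.
    """
    exp_pair = (exp["ea"], exp["oa"])
    out_pair = (out["ea"], out["oa"])
    palindromic = COMPLEMENT[exp["ea"]] == exp["oa"]

    if palindromic and action == 3:
        return None  # option 3: drop every palindromic SNP

    if palindromic and action == 2:
        if set(out_pair) != set(exp_pair):
            return None  # not the same pair of alleles at all
        mafs = [min(d["eaf"], 1 - d["eaf"]) for d in (exp, out)]
        if max(mafs) > maf_cutoff:
            return None  # frequency too close to 0.5 to infer the strand
        # If both effect alleles are minor (or both major), they are the
        # same physical allele; otherwise the outcome effect must be negated.
        same_side = (exp["eaf"] < 0.5) == (out["eaf"] < 0.5)
        return out["beta"] if same_side else -out["beta"]

    # Options 2 and 3 flip the strand of a non-palindromic SNP when the letters differ.
    if action in (2, 3) and set(out_pair) != set(exp_pair):
        out_pair = tuple(COMPLEMENT[a] for a in out_pair)

    if out_pair == exp_pair:
        return out["beta"]  # same effect allele
    if out_pair == exp_pair[::-1]:
        return -out["beta"]  # effect and other allele are swapped
    return None  # alleles cannot be reconciled


# Non-palindromic A/G SNP reported on the opposite strand in the outcome data.
exposure = {"ea": "A", "oa": "G", "eaf": 0.3}
outcome = {"ea": "C", "oa": "T", "eaf": 0.7, "beta": 0.1}
print(harmonise_beta(exposure, outcome, action=1))  # None
print(harmonise_beta(exposure, outcome, action=3))  # -0.1
```

In the usage example, option 1 cannot reconcile the alleles because it never flips strands, while option 3 flips C/T to G/A, recognises that the effect and other alleles are swapped, and negates the outcome effect.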