What is two-sample MR?

In two-sample MR (2SMR), the associations of the genetic instrumental variables (IVs) with the exposure (risk factor) and their associations with the outcome are estimated in different samples, ideally with no overlapping participants. 2SMR can use either individual participant data or GWAS summary data. In mGWAS-Explorer, 2SMR is implemented using summary statistics.

The main workflow of 2SMR is as follows:

  • Select instruments (e.g., significant SNPs from GWAS) for the exposure (e.g., metabolites). Perform LD clumping if needed.
  • Extract the instruments from the GWAS of the outcomes of interest. Use LD proxies if the instrument SNPs are not available.
  • Harmonize the data so that the effect of each SNP on the exposure and its effect on the outcome refer to the same allele.
  • Perform MR analysis.
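
The harmonization and MR steps above can be sketched as follows. This is a minimal illustration, not the mGWAS-Explorer implementation: the SnpEffect class, the harmonize and ivw_estimate functions, and the example numbers are all hypothetical. The MR step uses a fixed-effect inverse-variance weighted (IVW) estimate, one common 2SMR method, chosen here only as an example. Strand flips and palindromic SNPs are not handled.

```python
import math
from dataclasses import dataclass


@dataclass
class SnpEffect:
    rsid: str
    effect_allele: str
    other_allele: str
    beta: float  # per-allele effect estimate
    se: float    # standard error of beta


def harmonize(exposure, outcome):
    """Align outcome effects to the exposure effect allele.

    SNPs missing from the outcome or with mismatched alleles are dropped.
    """
    outcome_by_id = {o.rsid: o for o in outcome}
    pairs = []
    for e in exposure:
        o = outcome_by_id.get(e.rsid)
        if o is None:
            continue
        if (o.effect_allele, o.other_allele) == (e.effect_allele, e.other_allele):
            pairs.append((e, o))
        elif (o.effect_allele, o.other_allele) == (e.other_allele, e.effect_allele):
            # Outcome was reported for the other allele: flip the sign.
            flipped = SnpEffect(o.rsid, e.effect_allele, e.other_allele, -o.beta, o.se)
            pairs.append((e, flipped))
    return pairs


def ivw_estimate(pairs):
    """Fixed-effect IVW estimate and its standard error."""
    numerator = sum(e.beta * o.beta / o.se ** 2 for e, o in pairs)
    denominator = sum(e.beta ** 2 / o.se ** 2 for e, o in pairs)
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Made-up summary statistics, for illustration only.
exposure = [
    SnpEffect("rs1", "A", "G", 0.10, 0.010),
    SnpEffect("rs2", "C", "T", 0.20, 0.020),
]
outcome = [
    SnpEffect("rs1", "A", "G", 0.05, 0.010),
    SnpEffect("rs2", "T", "C", -0.10, 0.020),  # reported for the other allele
]
pairs = harmonize(exposure, outcome)
estimate, std_error = ivw_estimate(pairs)
print(f"IVW estimate = {estimate:.3f} (SE {std_error:.3f})")
```

In this toy data, rs2 is reported for opposite alleles in the two datasets, so its outcome effect changes sign before the estimate is computed.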

The exposure GWAS data consist of significant SNP-metabolite associations curated from 65 mGWAS studies, while the outcome GWAS data are obtained by querying the IEU OpenGWAS API.

LD clumping:
In LD clumping, SNPs are grouped by linkage disequilibrium (LD), and only the SNP with the lowest p-value is retained in each locus. The purpose is to obtain independent signals.
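
As a rough sketch of the idea, greedy clumping can be written as below. The function name ld_clump, the pairwise r2 lookup table, and the threshold value are assumptions made for illustration. Real clumping tools compute LD from a reference panel and also apply a physical distance window, which is omitted here.

```python
def ld_clump(pvalues, r2, r2_threshold):
    """Greedy clumping sketch.

    pvalues: {rsid: p-value}
    r2: {frozenset({rsid_a, rsid_b}): r-squared}; missing pairs count as 0.
    Returns the retained SNPs, most significant first.
    """
    kept = []
    # Visit SNPs from the lowest p-value to the highest.
    for snp in sorted(pvalues, key=pvalues.get):
        # Retain the SNP only if it is not in LD with any retained SNP.
        if all(r2.get(frozenset((snp, k)), 0.0) < r2_threshold for k in kept):
            kept.append(snp)
    return kept


# Made-up values, for illustration only.
pvalues = {"rs1": 1e-12, "rs2": 1e-9, "rs3": 1e-8}
r2 = {
    frozenset(("rs1", "rs2")): 0.8,
    frozenset(("rs2", "rs3")): 0.9,
}
independent = ld_clump(pvalues, r2, r2_threshold=0.1)
print(independent)  # ['rs1', 'rs3']
```

Here rs2 is removed because it is in LD with the more significant rs1, while rs3 is retained because it is only in LD with rs2, which was already removed.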

LD proxies:
If an input SNP is absent from the outcome GWAS, a SNP in LD with it (a proxy) is searched for and used instead.
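
A minimal sketch of proxy selection is shown below. The function name find_proxy, the input structures, and the min_r2 cutoff are hypothetical and only illustrate the logic. Mapping the proxy's alleles back to the alleles of the input SNP is also needed in practice and is left out.

```python
def find_proxy(target, outcome_snps, ld_with_target, min_r2):
    """Return the target SNP if the outcome GWAS has it, else the best proxy.

    outcome_snps: set of rsids available in the outcome GWAS
    ld_with_target: {rsid: r-squared with the target SNP}
    Returns None when no candidate reaches min_r2.
    """
    if target in outcome_snps:
        return target
    candidates = [
        (r2, snp)
        for snp, r2 in ld_with_target.items()
        if snp in outcome_snps and r2 >= min_r2
    ]
    if not candidates:
        return None
    # Pick the available SNP in strongest LD with the target.
    return max(candidates)[1]


# Made-up values, for illustration only.
outcome_snps = {"rs1b", "rs1c", "rs9"}
ld_with_rs1 = {"rs1a": 0.95, "rs1b": 0.90, "rs1c": 0.50}
proxy = find_proxy("rs1", outcome_snps, ld_with_rs1, min_r2=0.8)
print(proxy)  # rs1b
```

In this example rs1a has the strongest LD with rs1 but is itself missing from the outcome GWAS, so rs1b is chosen.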

References:
Hemani, Gibran et al. eLife vol. 7 e34408. 30 May. (2018)
Lawlor, Debbie A. International journal of epidemiology vol. 45,3 (2016)