What is two-sample MR?

le_chang · October 31, 2022, 3:58pm

In two-sample MR (2SMR), the associations between the genetic instrumental variables (IVs) and the exposure (risk factor) and the associations between the IVs and the outcome are ideally estimated in different samples. 2SMR can be run with either individual participant data or GWAS summary data. In mGWAS-Explorer, we implemented 2SMR using summary statistics.

The main workflow of 2SMR is as follows:

Select instruments (e.g., significant SNPs from GWAS) for the exposure (e.g., metabolites). Perform LD clumping if needed.
Extract the instruments from the outcomes of interest. Use LD proxies if the instrument SNPs are not available.
Harmonize the data so that the effect of each SNP on the exposure and its effect on the outcome refer to the same allele.
Perform MR analysis.
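
To make the last two steps concrete, here is a minimal Python sketch of harmonization followed by a fixed-effect inverse-variance weighted (IVW) estimate, one common MR method. It is an illustration only, not the mGWAS-Explorer implementation: the function names, the dictionary layout (`ea` = effect allele, `oa` = other allele, `beta`, `se`), and the toy numbers are all assumptions. Strand flips and palindromic SNPs are not handled here.

```python
import math

def harmonize(exposure, outcome):
    """Align outcome effects to the exposure effect allele (hypothetical helper).

    Both inputs map SNP id -> {"ea", "oa", "beta", "se"}.
    SNPs missing from the outcome, or with non-matching alleles, are dropped.
    """
    rows = []
    for snp, exp in exposure.items():
        out = outcome.get(snp)
        if out is None:
            continue
        if (out["ea"], out["oa"]) == (exp["ea"], exp["oa"]):
            beta_out = out["beta"]       # same effect allele, keep the sign
        elif (out["ea"], out["oa"]) == (exp["oa"], exp["ea"]):
            beta_out = -out["beta"]      # alleles swapped, flip the sign
        else:
            continue                     # alleles do not match, skip the SNP
        rows.append((snp, exp["beta"], beta_out, out["se"]))
    return rows

def ivw(rows):
    """Fixed-effect IVW estimate from harmonized (snp, bx, by, se_y) rows."""
    num = sum(bx * by / se ** 2 for _, bx, by, se in rows)
    den = sum(bx ** 2 / se ** 2 for _, bx, by, se in rows)
    return num / den, math.sqrt(1.0 / den)

# Toy data: rs2 is reported on the opposite allele in the outcome GWAS.
exposure = {
    "rs1": {"ea": "A", "oa": "G", "beta": 0.2, "se": 0.02},
    "rs2": {"ea": "C", "oa": "T", "beta": 0.1, "se": 0.02},
}
outcome = {
    "rs1": {"ea": "A", "oa": "G", "beta": 0.10, "se": 0.05},
    "rs2": {"ea": "T", "oa": "C", "beta": -0.05, "se": 0.05},
}
estimate, std_err = ivw(harmonize(exposure, outcome))
```

In this toy example both SNPs have a ratio estimate (outcome effect divided by exposure effect) of 0.5 once rs2 is flipped, so the IVW estimate is also 0.5.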

The exposure GWAS data come from the significant SNP-metabolite associations curated from 65 mGWAS studies, while the outcome GWAS data are obtained by querying the IEU OpenGWAS API.

LD clumping:
In LD clumping, only the SNP with the lowest p-value is retained per locus, where loci are defined by linkage disequilibrium (LD). The purpose is to obtain independent signals.
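
As a rough illustration of the idea, the following Python sketch clumps greedily: it walks through SNPs from the lowest p-value upward and keeps a SNP only if it is not in LD with one already kept. The function name, the pairwise `r2` lookup, and the 0.1 threshold are assumptions for the example; real clumping tools also restrict comparisons to a physical window around each index SNP, which is omitted here.

```python
def ld_clump(pvals, r2, r2_threshold=0.1):
    """Greedy LD clumping sketch (hypothetical helper).

    pvals: dict mapping SNP id -> p-value.
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2;
        pairs that are absent are treated as r^2 = 0.
    r2_threshold: illustrative cut-off, not a recommended value.
    """
    kept = []
    # Visit SNPs from most to least significant.
    for snp in sorted(pvals, key=pvals.get):
        # Keep the SNP only if it is independent of every SNP kept so far.
        if all(r2.get(frozenset((snp, k)), 0.0) < r2_threshold for k in kept):
            kept.append(snp)
    return kept

# Toy data: rs2 is in strong LD with the more significant rs1.
pvals = {"rs1": 1e-10, "rs2": 1e-8, "rs3": 1e-9}
r2 = {frozenset(("rs1", "rs2")): 0.8}
independent = ld_clump(pvals, r2)
```

Here rs1 and rs3 are retained, and rs2 is dropped because it tags the same signal as rs1.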

LD proxies:
If an input SNP is absent from the outcome GWAS, a proxy SNP in LD with the input SNP is searched for and used in its place.
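
The lookup can be sketched in Python as follows. This is only an assumption about how such a search might be organized: the function name, the `ld_partners` structure, and the minimum r^2 of 0.8 are hypothetical, and in practice the LD information would come from a reference panel.

```python
def find_proxies(instruments, outcome_snps, ld_partners, min_r2=0.8):
    """Map each instrument to the SNP to look up in the outcome GWAS.

    instruments: iterable of SNP ids selected for the exposure.
    outcome_snps: set of SNP ids available in the outcome GWAS.
    ld_partners: dict mapping SNP id -> list of (partner id, r^2) pairs.
    min_r2: illustrative minimum LD for accepting a proxy.
    Instruments with no direct match and no acceptable proxy are omitted.
    """
    resolved = {}
    for snp in instruments:
        if snp in outcome_snps:
            resolved[snp] = snp          # present in the outcome, no proxy needed
            continue
        candidates = [
            (pair_r2, partner)
            for partner, pair_r2 in ld_partners.get(snp, [])
            if pair_r2 >= min_r2 and partner in outcome_snps
        ]
        if candidates:
            # Choose the available partner in strongest LD with the instrument.
            resolved[snp] = max(candidates)[1]
    return resolved

# Toy data: rs2 needs a proxy; rs3 has no partner above the threshold.
lookup = find_proxies(
    ["rs1", "rs2", "rs3"],
    {"rs1", "rs9"},
    {"rs2": [("rs8", 0.95), ("rs9", 0.9)], "rs3": [("rs9", 0.5)]},
)
```

In the toy data, rs1 is used directly, rs2 is replaced by rs9 (rs8 has higher r^2 but is not in the outcome GWAS), and rs3 is dropped. Note that when a proxy is used, its alleles still have to be mapped back to those of the original instrument before harmonization.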

References:
Hemani, Gibran et al. eLife vol. 7, e34408 (30 May 2018)
Lawlor, Debbie A. International Journal of Epidemiology vol. 45, no. 3 (2016)