In two-sample MR (2SMR), the associations of the genetic instrumental variables (IVs) with the exposure and with the outcome are estimated in different samples, ideally with no overlap between them. 2SMR can use either individual-level participant data or GWAS summary data. In mGWAS-Explorer, 2SMR is implemented using summary statistics.
The main workflow of 2SMR is as follows:
- Select instruments (e.g., significant SNPs from GWAS) for the exposure (e.g., metabolites). Perform LD clumping if needed.
- Extract the associations of the instrument SNPs from the GWAS of the outcomes of interest. Use LD proxies if an instrument SNP is not available.
- Harmonize the data so that the effect of each SNP on the exposure and its effect on the outcome refer to the same allele.
- Perform MR analysis.
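The last two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the mGWAS-Explorer implementation: the SNP IDs and summary statistics are made up, the helper names `harmonize` and `ivw` are hypothetical, strand flips and palindromic SNPs are not handled, and the fixed-effect inverse-variance weighted (IVW) estimator is used only as one common choice of MR method.

```python
import math

# Hypothetical summary statistics: effect allele (ea), other allele (oa),
# effect size (beta) and standard error (se) per SNP.
exposure = {
    "rs1": {"ea": "A", "oa": "G", "beta": 0.10, "se": 0.01},
    "rs2": {"ea": "C", "oa": "T", "beta": 0.20, "se": 0.02},
    "rs3": {"ea": "G", "oa": "A", "beta": 0.15, "se": 0.02},
}
outcome = {
    "rs1": {"ea": "A", "oa": "G", "beta": 0.05, "se": 0.01},
    "rs2": {"ea": "T", "oa": "C", "beta": -0.10, "se": 0.02},  # alleles swapped
    "rs3": {"ea": "G", "oa": "T", "beta": 0.07, "se": 0.02},   # alleles do not match
}


def harmonize(exposure, outcome):
    """Express outcome effects relative to the exposure effect allele.

    SNPs missing from the outcome, or whose allele pair cannot be matched
    by a simple swap, are dropped.
    """
    rows = []
    for snp, exp in exposure.items():
        out = outcome.get(snp)
        if out is None:
            continue
        if (out["ea"], out["oa"]) == (exp["ea"], exp["oa"]):
            beta_out = out["beta"]
        elif (out["ea"], out["oa"]) == (exp["oa"], exp["ea"]):
            beta_out = -out["beta"]  # flip the sign so both effects share an allele
        else:
            continue
        rows.append((snp, exp["beta"], beta_out, out["se"]))
    return rows


def ivw(rows):
    """Fixed-effect IVW estimate and standard error from harmonized rows."""
    numerator = sum(bx * by / se_y ** 2 for _, bx, by, se_y in rows)
    denominator = sum(bx ** 2 / se_y ** 2 for _, bx, _, se_y in rows)
    return numerator / denominator, math.sqrt(1.0 / denominator)


harmonized = harmonize(exposure, outcome)
estimate, std_err = ivw(harmonized)
print([row[0] for row in harmonized], round(estimate, 3), round(std_err, 3))
```

In this toy input, rs2 is reported on the opposite allele in the outcome data, so its outcome effect is sign-flipped before the estimate is computed, and rs3 is dropped because its alleles cannot be matched.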
The exposure GWAS data consist of the significant SNP-metabolite associations curated from 65 mGWAS studies, while the outcome GWAS data are obtained by querying the IEU OpenGWAS API.
LD clumping:
In LD clumping, SNPs are grouped by linkage disequilibrium (LD), and only the SNP with the lowest p-value is retained in each group of correlated SNPs at a locus. The purpose is to obtain independent signals to use as instruments.
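The idea behind clumping can be illustrated with a greedy procedure. The sketch below is a simplified illustration and not the routine used by mGWAS-Explorer: the function name `ld_clump`, the pairwise r² values, and the `r2_threshold` default are all assumptions, and the physical distance window and p-value cut-off that clumping tools usually apply are left out.

```python
def ld_clump(pvalues, r2, r2_threshold=0.1):
    """Greedy clumping: keep the most significant SNP, drop SNPs in LD with it.

    pvalues: dict mapping SNP ID to p-value.
    r2: dict mapping frozenset({snp_a, snp_b}) to pairwise r^2;
        pairs that are absent are treated as uncorrelated.
    """
    ordered = sorted(pvalues, key=pvalues.get)  # most significant first
    kept, removed = [], set()
    for snp in ordered:
        if snp in removed:
            continue
        kept.append(snp)
        for other in ordered:
            if other != snp and r2.get(frozenset((snp, other)), 0.0) > r2_threshold:
                removed.add(other)
    return kept


# Hypothetical input: two loci, each with two correlated SNPs.
pvalues = {"rs1": 1e-12, "rs2": 1e-9, "rs3": 1e-10, "rs4": 1e-8}
r2 = {
    frozenset(("rs1", "rs2")): 0.8,
    frozenset(("rs3", "rs4")): 0.6,
}
print(ld_clump(pvalues, r2))
```

With these made-up values, rs1 and rs3 are retained as the lead SNPs of their loci, and rs2 and rs4 are dropped.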
LD proxies:
If an input SNP is absent from the outcome GWAS, a proxy SNP that is in LD with the input SNP is searched for and used in its place.
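A proxy search can be sketched as a lookup in an LD table. The example below is illustrative only: the function name `find_proxy`, the LD table, and the `min_r2` default of 0.8 are assumptions rather than the settings used by mGWAS-Explorer.

```python
def find_proxy(snp, outcome_snps, ld_table, min_r2=0.8):
    """Return the SNP itself if present in the outcome GWAS, else the best proxy.

    ld_table: dict mapping a SNP ID to {candidate proxy ID: r^2}.
    Returns None when no candidate reaches min_r2 and is present in the outcome.
    """
    if snp in outcome_snps:
        return snp
    candidates = [
        (r2, proxy)
        for proxy, r2 in ld_table.get(snp, {}).items()
        if r2 >= min_r2 and proxy in outcome_snps
    ]
    if not candidates:
        return None
    return max(candidates)[1]  # candidate with the highest r^2


# Hypothetical data: rs2 is missing from the outcome GWAS.
outcome_snps = {"rs1", "rs20", "rs22"}
ld_table = {"rs2": {"rs20": 0.95, "rs21": 0.99, "rs22": 0.85}}

for snp in ("rs1", "rs2", "rs3"):
    print(snp, find_proxy(snp, outcome_snps, ld_table))
```

Here rs21 has the highest r² with rs2 but is itself absent from the outcome data, so rs20 is chosen. In practice the alleles of the proxy also have to be mapped to the alleles of the original SNP before harmonization; that step is omitted from this sketch.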
References:
Hemani, Gibran et al. eLife vol. 7, e34408 (30 May 2018)
Lawlor, Debbie A. International Journal of Epidemiology vol. 45,3 (2016)