Which databases are used for SNP annotation?

General SNP annotation:

  • VEP: The Ensembl Variant Effect Predictor (VEP) is a powerful toolset for the analysis and annotation of genomic variants, which provides access to an extensive collection of genomic annotations. Based on VEP annotation mapped to the latest GRCh38 human assembly, we provide a configurable flank ranging from 5,000 to 50,000 base pairs to search for overlaps between input variants and genomic features. Users can choose either a specific distance or the nearest few genes to generate a manageable SNP-gene interaction network.

  • PhenoScanner: a curated database of publicly available results from large-scale genetic association studies in humans. It currently contains over 150 million genetic variants and about 84 million associations with gene expression. Expression quantitative trait loci (eQTL) analysis is important for understanding the effect of SNPs on gene expression. The original association database was calculated for all pairs within 500 kb; to produce manageable results on our website, only associations with P < 1 × 10^-5 (the threshold suggested by the original study) are returned for queried SNPs.
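
As a rough illustration of the two filtering steps described above (flank-based gene lookup and the eQTL P-value cutoff), here is a minimal Python sketch. It is not our server code: the `Gene` record, the function names, and the `"p"` key on eQTL records are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    # Hypothetical minimal gene record; coordinates are 1-based, inclusive.
    name: str
    chrom: str
    start: int
    end: int

def distance_to_gene(pos, gene):
    """Distance in bp from a SNP position to a gene (0 if the SNP lies inside it)."""
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))

def genes_within_flank(chrom, pos, genes, flank=5000):
    """Genes on the same chromosome within `flank` bp of the SNP, closest first."""
    if not 5000 <= flank <= 50000:
        raise ValueError("flank must be between 5000 and 50000 bp")
    hits = [(distance_to_gene(pos, g), g) for g in genes if g.chrom == chrom]
    hits = [(d, g) for d, g in hits if d <= flank]
    return [g for d, g in sorted(hits, key=lambda item: (item[0], item[1].name))]

def nearest_genes(chrom, pos, genes, k=3):
    """The k closest genes on the same chromosome, regardless of distance."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    same_chrom.sort(key=lambda g: (distance_to_gene(pos, g), g.name))
    return same_chrom[:k]

def filter_eqtls(records, p_cutoff=1e-5):
    """Keep only eQTL records whose P value is below the cutoff."""
    return [r for r in records if r["p"] < p_cutoff]
```

With a 5,000 bp flank, a SNP returns only the genes lying within that distance; switching to the nearest-genes option returns a fixed number of genes even when they lie farther away.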

SNP to miRNA variant annotation:

  • ADmiRE (Annotative Database of miRNA Elements): a database dedicated to miRNA variant annotation, which combines multiple existing and new biological annotations of variation in miRNA genes across human datasets. It includes 10,206 mature miRNA variants (3,257 within the seed region) annotated from multiple large sequencing datasets such as gnomAD (15,496 genomes; 123,136 exomes).

SNP to TF binding sites:

  • SNP2TFBS: a database that provides specific annotations for human SNPs, namely whether they are predicted to abolish, create, or change the affinity of one or several transcription factor (TF) binding sites (TFBSs). A SNP’s effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. TFBS models (PWMs) from the JASPAR database are used to generate the SNP-TF associations. For each SNP, it provides the list of TFBSs (PWMs) affected, sorted by the magnitude of the effect.
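
To make the PWM idea concrete, the following Python sketch scores the reference and alternate alleles against a toy matrix and labels the SNP as abolishing, creating, or changing a site. It is an illustration only, not the SNP2TFBS pipeline: the count matrix, the pseudocount, the uniform background, the score threshold, and all function names are assumptions, and only the forward strand is scanned for brevity. Sequences are expected to contain only A, C, G, and T.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds PWM."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_window_score(seq, snp_index, pwm):
    """Best PWM score over all forward-strand windows that overlap the SNP."""
    width = len(pwm)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(seq) - width)
    best = float("-inf")
    for start in range(first, last + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        best = max(best, score)
    return best

def classify_snp(ref_seq, snp_index, alt_base, pwm, threshold):
    """Return (effect, alt_score - ref_score) for a single-base substitution."""
    alt_seq = ref_seq[:snp_index] + alt_base + ref_seq[snp_index + 1:]
    ref_score = best_window_score(ref_seq, snp_index, pwm)
    alt_score = best_window_score(alt_seq, snp_index, pwm)
    ref_hit = ref_score >= threshold
    alt_hit = alt_score >= threshold
    if ref_hit and not alt_hit:
        effect = "abolish"
    elif alt_hit and not ref_hit:
        effect = "create"
    elif ref_hit and alt_hit and ref_score != alt_score:
        effect = "change"
    else:
        effect = "none"
    return effect, alt_score - ref_score

# Toy count matrix for a hypothetical factor that prefers the motif TGCA.
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
```

Running `classify_snp` for one SNP against several PWMs and sorting the results by the absolute score difference would give a ranked list similar in spirit to the one described above.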