Which kinase-substrate interaction databases are available for phosphoproteomics kinase enrichment analysis?

Kinase Enrichment Analysis (KEA) is accessible via the Phosphoproteomics module. It identifies kinases whose targets are overrepresented among the differentially phosphorylated sites in the user's dataset. The system determines these associations by cross-referencing the data against experimentally validated relationships, computational predictions, or both.
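
To make "overrepresented" concrete, here is a minimal sketch of an over-representation test. It assumes a one-sided hypergeometric test, which is a common choice for this kind of analysis; the test the module actually applies is not stated here. The function names, the kinase names, and the site identifiers are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def kinase_enrichment(kinase_to_sites, regulated_sites, background_sites):
    """Return {kinase: (overlap, p_value)} for each kinase (assumed test)."""
    background = set(background_sites)
    regulated = set(regulated_sites) & background
    results = {}
    for kinase, targets in kinase_to_sites.items():
        # Only targets that were actually measured can count as hits.
        targets_bg = set(targets) & background
        overlap = len(targets_bg & regulated)
        p_value = hypergeom_sf(
            overlap, len(background), len(targets_bg), len(regulated)
        )
        results[kinase] = (overlap, p_value)
    return results


# Hypothetical toy data: 10 measured sites, 3 of them regulated.
background = [f"SITE_{i}" for i in range(10)]
regulated = ["SITE_0", "SITE_1", "SITE_2"]
kinase_to_sites = {
    "KINASE_A": ["SITE_0", "SITE_1", "SITE_2"],  # all targets regulated
    "KINASE_B": ["SITE_7", "SITE_8"],            # no targets regulated
}
enrichment = kinase_enrichment(kinase_to_sites, regulated, background)
```

In this toy case all three KINASE_A targets are regulated, so it receives a small p-value, while KINASE_B has no regulated targets and receives a p-value of 1. A real analysis would also correct for testing many kinases at once.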

The analysis draws on the following databases, each with its own methodology:

  1. GPS (Group-based Prediction System)
    Source: Predicted (Computational Algorithm)
    This database relies on a hierarchical clustering algorithm to predict kinase-specific phosphorylation sites. By grouping kinases based on structural similarity and training on known sequence patterns, it generates kinase-substrate predictions even for kinases with sparse experimental data (Chen et al., 2023).
  2. Phospho.ELM
    Source: Literature Curated (Experimentally Verified)
    This database consists of experimentally validated phosphorylation sites that have been manually extracted from scientific papers. It covers serine, threonine, and tyrosine phosphorylation within eukaryotic proteins (Dinkel et al., 2011).
  3. PhosphoNetworks
    Source: Experimental (High-Throughput Protein Microarrays)
    This database is built on direct experimental testing rather than literature mining. It uses protein microarrays in which potential substrates are exposed to active kinases in vitro to identify direct phosphorylation events (Hu et al., 2014).
  4. PhosphoPICK
    Source: Predicted (Context-Aware)
    This database improves prediction accuracy by incorporating cellular context. It filters results using protein-protein interaction and gene expression data, downgrading kinase-substrate pairs that are unlikely to co-occur in vivo to help reduce false positives (Patrick et al., 2015).
  5. RegPhos
    Source: Integrated (Curated + Predicted)
    This database is designed to view phosphorylation as part of a larger system, combining experimental sites with metabolic and transcriptional pathway data. Uniquely, it employs prediction tools and subcellular localization filters to assign the most probable kinases to “orphan” phosphorylation sites (Huang et al., 2014).
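
Because the five sources differ in evidence type, it can help to restrict kinase-substrate pairs by that type before running the enrichment. The sketch below is illustrative only: the evidence labels are taken from the "Source" lines above, but the record layout, the function name, and the kinase and site identifiers are hypothetical and do not reflect the module's internal format.

```python
# Evidence categories, simplified from the "Source" lines in the list above.
DATABASE_EVIDENCE = {
    "GPS": "predicted",
    "Phospho.ELM": "curated",
    "PhosphoNetworks": "experimental",
    "PhosphoPICK": "predicted",
    "RegPhos": "integrated",
}


def select_pairs(records, allowed_evidence):
    """Keep (kinase, site) pairs supported by an allowed evidence type.

    `records` is an iterable of (database, kinase, site) tuples
    (a hypothetical layout). Returns {(kinase, site): [databases]}.
    """
    support = {}
    for database, kinase, site in records:
        if DATABASE_EVIDENCE.get(database) in allowed_evidence:
            support.setdefault((kinase, site), set()).add(database)
    # Sort database names so the output is deterministic.
    return {pair: sorted(dbs) for pair, dbs in support.items()}


# Hypothetical records; not real kinase-substrate data.
records = [
    ("GPS", "KINASE_A", "SUB1_S10"),
    ("Phospho.ELM", "KINASE_A", "SUB1_S10"),
    ("PhosphoNetworks", "KINASE_B", "SUB2_T45"),
    ("PhosphoPICK", "KINASE_C", "SUB3_Y7"),
]

# Keep only pairs with non-predicted support.
validated_pairs = select_pairs(records, {"curated", "experimental"})
```

Here the pair supported only by PhosphoPICK is dropped, and the KINASE_A pair is kept on the strength of its Phospho.ELM entry alone.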