There are two general approaches to multi-omics integration:
- Knowledge-driven integration: this type of integration uses prior knowledge to link key features across different omics layers. For instance, the KEGG metabolic network is often used to connect key genes, proteins, or metabolites obtained from different omics layers to help identify “activated biological processes”. This type of analysis can be extended to other molecular interaction networks, such as protein-protein interactions and TF-gene-miRNA interactions. We have developed OmicsNet and miRNet to support multi-omics integration based on comprehensive, high-quality molecular networks.
  - Knowledge-driven integration is mainly limited to model organisms for which comprehensive knowledge bases exist. In addition, it is biased toward existing knowledge and has limited capacity for discovering novel relationships. This is the gap that data- and model-driven integration aims to address.
- Data- and model-driven integration: this type of integration applies statistical models or machine learning algorithms to detect key features and patterns that co-vary across omics layers. In general, it is not confined to existing knowledge and is therefore better suited to novel discoveries.
  - A key limitation of this type of integration is the lack of consensus approaches: a wide variety of methods have been developed over the past decade, each with its own model assumptions (or biases) and pitfalls. Using these methods properly and interpreting their results are the main challenges, and this is one of the main motivations for developing OmicsAnalyst.
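
The contrast between the two approaches can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names (`gene_A`, `metab_X`, ...), the measurements, and the `pathway_members` map are made up, and plain Pearson correlation stands in for the much richer statistical and machine learning models used in practice. It is not how OmicsNet, miRNet, or OmicsAnalyst are implemented.

```python
from itertools import product
from math import sqrt

# Toy, made-up measurements: feature -> values across the same 5 samples.
transcriptomics = {
    "gene_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_B": [5.0, 3.0, 4.0, 1.0, 2.0],
}
metabolomics = {
    "metab_X": [2.1, 3.9, 6.2, 8.0, 9.8],
    "metab_Y": [1.0, 1.2, 0.9, 1.1, 1.0],
}

# Hypothetical prior-knowledge map: pathway -> member features (any layer).
pathway_members = {
    "pathway_1": {"gene_A", "metab_Y"},
    "pathway_2": {"gene_B"},
}


def knowledge_driven_links(genes, metabolites, pathways):
    """Link a gene and a metabolite when a known pathway contains both."""
    links = []
    for name, members in sorted(pathways.items()):
        for g, m in product(sorted(genes), sorted(metabolites)):
            if g in members and m in members:
                links.append((g, m, name))
    return links


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def data_driven_links(layer_a, layer_b, threshold=0.9):
    """Link features whose profiles co-vary strongly across samples."""
    links = []
    pairs = product(sorted(layer_a.items()), sorted(layer_b.items()))
    for (name_a, values_a), (name_b, values_b) in pairs:
        r = pearson(values_a, values_b)
        if abs(r) >= threshold:
            links.append((name_a, name_b, round(r, 3)))
    return links


known = knowledge_driven_links(transcriptomics, metabolomics, pathway_members)
covarying = data_driven_links(transcriptomics, metabolomics)
print("knowledge-driven:", known)
print("data-driven:", covarying)
```

In this toy setup the knowledge-driven step can only report the pair that the pathway map already lists together, whereas the data-driven step reports a pair that is absent from the map but co-varies strongly across samples. The correlation threshold of 0.9 is arbitrary, which mirrors the point above that each data-driven method carries its own assumptions.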