How to avoid filtering during data normalization

The MetaboAnalyst website aims to provide best practices for metabolomics data analysis.
I assume your data are from untargeted LC-MS metabolomics, which typically contains a high proportion of noise. With that in mind:

  • If you choose no filtering, you will keep the top 5000 features, not 2500. In our own experience, after minimal data cleaning (blank subtraction and removal of low-repeatability peaks based on QC samples), the number of peaks is already below this limit in most cases.
  • Please read this post on data filtering for PCA
  • Our recent benchmark study shows that GSEA does not perform well, and that an annotation rate of ~30% can achieve high recall (~90%) for pathway activity prediction using mummichog.
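
To make the "minimal data cleaning" point more concrete, here is a rough Python sketch of the two steps (blank filtering and a QC repeatability filter). This is not MetaboAnalyst code: the function name `clean_peaks`, the input layout, the 3-fold blank threshold, and the 30% QC RSD cutoff are all assumptions chosen for illustration, and a blank-ratio filter is used here in place of a literal blank subtraction.

```python
from statistics import mean, stdev


def clean_peaks(peaks, blank_fold=3.0, max_qc_rsd=0.30):
    """Return IDs of features that pass a blank filter and a QC RSD filter.

    peaks: dict mapping feature ID -> dict with intensity lists under
           the keys "samples", "blanks", and "qc" (hypothetical layout).
    blank_fold and max_qc_rsd are illustrative thresholds, not defaults
    taken from MetaboAnalyst.
    """
    kept = []
    for feature_id, values in peaks.items():
        # Blank filter: mean sample signal must clearly exceed the blank.
        if mean(values["samples"]) < blank_fold * mean(values["blanks"]):
            continue

        # Repeatability filter: relative standard deviation across QC runs.
        qc_mean = mean(values["qc"])
        if qc_mean <= 0:
            continue
        if stdev(values["qc"]) / qc_mean > max_qc_rsd:
            continue

        kept.append(feature_id)
    return kept


# Toy data, invented for this example only.
example_peaks = {
    "f1": {"samples": [1000, 1100, 900], "blanks": [10, 12], "qc": [1000, 1020, 980]},
    "f2": {"samples": [50, 60, 40], "blanks": [40, 45], "qc": [50, 52, 48]},
    "f3": {"samples": [500, 520, 480], "blanks": [5, 5], "qc": [200, 600, 1000]},
}

kept = clean_peaks(example_peaks)
print(f"{len(kept)} of {len(example_peaks)} features kept: {kept}")
```

In this toy input, `f2` fails the blank filter and `f3` fails the QC repeatability filter. On a real peak table you would compare the number of features kept against the 5000 limit mentioned above.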

For the R package installation, have you tried following our instructions? If so, what issues did you run into?