How does MetaboAnalyst deal with over-fitting in biomarker analysis?

In multivariate exploratory ROC analysis, MetaboAnalyst uses repeated, balanced sub-sampling cross validation (CV) to test the performance of models built with different numbers of features. In each CV iteration, two thirds of the samples are used for feature selection and model training, and the remaining one third is used for testing. Because the test samples play no part in feature selection or training, the performance estimate is not inflated by fitting to them. This procedure is repeated 50 times to produce a more stable estimate.
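The sub-sampling scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MetaboAnalyst's actual code: the function names (`auc`, `balanced_split`, `subsampling_cv`), the ranking of features by univariate AUC, and the "signed sum" toy model are all hypothetical stand-ins for whichever feature selection method and classifier are really used. The point it shows is that features are chosen from the training two thirds only, and performance is measured on the held-out third.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUROC by pairwise comparison (Mann-Whitney); labels are 0 or 1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_split(labels, train_frac=2 / 3, rng=random):
    """Draw train_frac of each class for training; the rest is for testing."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)
        train += idx[:cut]
        test += idx[cut:]
    return train, test


def subsampling_cv(X, y, n_features, n_repeats=50, seed=0):
    """Mean test AUC over repeated balanced sub-sampling (assumed toy model)."""
    rng = random.Random(seed)
    test_aucs = []
    for _ in range(n_repeats):
        train, test = balanced_split(y, rng=rng)
        y_train = [y[i] for i in train]

        # Feature selection sees the training samples only.
        train_auc = {
            j: auc([X[i][j] for i in train], y_train) for j in range(len(X[0]))
        }
        ranked = sorted(train_auc, key=lambda j: abs(train_auc[j] - 0.5), reverse=True)
        top = ranked[:n_features]

        # Toy model (assumption): sum the selected features, each oriented
        # by the direction it showed in the training data.
        sign = {j: 1 if train_auc[j] >= 0.5 else -1 for j in top}
        scores = [sum(sign[j] * X[i][j] for j in top) for i in test]
        test_aucs.append(auc(scores, [y[i] for i in test]))
    return mean(test_aucs)
```

Calling `subsampling_cv(X, y, n_features=k)` for several values of `k` gives one averaged test AUC per model size, which is the kind of comparison the paragraph above describes.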

In ROC Tester, MetaboAnalyst also provides permutation tests to assess the significance of a biomarker model by comparing its performance with the performance obtained on data with shuffled group labels.
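The idea behind such a permutation test can be illustrated with the following minimal sketch. It is not MetaboAnalyst's implementation: the names `permutation_p_value` and `toy_performance`, the default number of permutations, and the "+1" correction in the p-value are assumptions made for this example. The `performance` argument stands in for the full model-building and evaluation procedure, which has to be rerun on every shuffled label set.

```python
import random


def auc(scores, labels):
    """AUROC by pairwise comparison (Mann-Whitney); labels are 0 or 1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(performance, X, y, n_perm=1000, seed=0):
    """Empirical p-value: how often do shuffled labels do as well as the real ones?"""
    rng = random.Random(seed)
    observed = performance(X, y)
    shuffled = list(y)  # copy, so the caller's labels are left untouched
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if performance(X, shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the p-value from being exactly zero
    # (a common convention, assumed here).
    return observed, (hits + 1) / (n_perm + 1)


def toy_performance(X, y):
    """Hypothetical stand-in for a model: direction-agnostic AUC of the row sums."""
    a = auc([sum(row) for row in X], y)
    return max(a, 1 - a)


# Ten samples, one feature that separates the two groups cleanly.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [1.1], [1.2], [1.3], [1.4], [1.5]]
y = [0] * 5 + [1] * 5
observed, p_value = permutation_p_value(toy_performance, X, y, n_perm=200)
```

A small p-value means that few shuffled label sets reach the observed performance, which suggests the model reflects a real group difference rather than chance structure in the data.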

Please note that neither CV nor permutation tests are used in classical univariate ROC curve analysis. There, the AUROC mainly indicates biomarker potential (i.e. it is useful for ranking features) and should not be interpreted as prediction performance.