Why does univariate ROC analysis sometimes yield a higher AUROC than the one obtained using multiple features?

The ROC curves created using the ROC Explorer or Tester are based on the cross-validation performance of multivariate algorithms (SVM, PLS-DA, or Random Forests). In contrast, the classical univariate ROC curves are based on performance measured by testing all possible cutoffs across ALL data points, with no held-out samples.

Therefore, the AUROC from a cross-validated ROC curve is more realistic for prediction purposes, while the AUROC from a ROC curve created by the univariate approach is often too optimistic (i.e., overfit). In other words, the univariate AUROC can be considered an indicator of a feature's discriminating “potential”, not its actual predictive performance.
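To make the contrast concrete, here is a minimal scikit-learn sketch (my own illustration, not MetaboAnalyst's implementation) comparing the best univariate AUROC computed on all samples against a cross-validated multivariate AUROC on the same data. The synthetic dataset, model choice, and the direction-agnostic `max(a, 1 - a)` convention are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic two-class data standing in for a metabolomics table
# (60 samples x 20 features; sizes are arbitrary for illustration).
X, y = make_classification(n_samples=60, n_features=20, n_informative=3,
                           random_state=0)

# Classical univariate AUROC: rank samples by each feature's raw values
# and test every possible cutoff on ALL samples; nothing is held out.
univ_aucs = [roc_auc_score(y, X[:, j]) for j in range(X.shape[1])]
univ_aucs = [max(a, 1 - a) for a in univ_aucs]  # direction-agnostic (assumption)
print(f"Best univariate AUROC (all data): {max(univ_aucs):.3f}")

# Cross-validated multivariate AUROC: each sample is scored by a model
# that never saw that sample during training.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
probs = cross_val_predict(RandomForestClassifier(random_state=0), X, y,
                          cv=cv, method="predict_proba")[:, 1]
print(f"Cross-validated multivariate AUROC: {roc_auc_score(y, probs):.3f}")
```

On most random draws the best univariate AUROC will tend to come out higher, because that feature is both selected and evaluated on the same samples, which is exactly the optimism described above.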


One additional question:
Does this mean that no ML-based classifiers such as PLS-DA or SVM are used in the classical univariate version?
Or do the univariate and multivariate analyses differ only in terms of validation?
I could not find any reference to a PLS-DA model or similar in the univariate analysis functions.

“Classical” means the traditional approach from before ML and cross-validation became popular: the feature values themselves serve as the classifier score, and every possible cutoff is tested on all samples, so no ML model is involved.

Keep in mind that MetaboAnalyst also allows you to manually create a biomarker model (the Tester path). To generate a cross-validated univariate ROC model, select a single feature of interest and use any of the models (SVM, PLS-DA, RF) to get the ROC curve; see the sketch below for the general idea.
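Outside MetaboAnalyst, the same idea can be reproduced in a few lines of scikit-learn. This is a sketch of my own, not the Tester's actual code; it assumes that restricting the model to one feature column mirrors the single-feature Tester workflow.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

X, y = make_classification(n_samples=60, n_features=20, random_state=0)
single = X[:, [0]]  # keep the one chosen feature as a 2-D column

# Fit the model on training folds only and score each held-out sample,
# yielding a cross-validated ROC for a single feature.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
probs = cross_val_predict(SVC(probability=True), single, y, cv=cv,
                          method="predict_proba")[:, 1]
print(f"Cross-validated univariate AUROC: {roc_auc_score(y, probs):.3f}")
```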
