Does MetaboAnalyst use training and testing data in their PLS-DA analysis?


In many PLS-DA demonstrations I see online, a model is first built on training data and then applied to test data. I found some information about how MetaboAnalyst performs its PLS-DA here, but I would like to clarify whether the data are split this way at all, especially for the scores, loadings, and important features sections.

Many thanks!

Model validation (train/test splitting, cross-validation) and the scores/loadings plots are different concepts that serve different purposes.

Cross-validation (CV) is a way to detect overfitting in classification tasks, together with permutation testing. It matters mainly when you want to use the PLS-DA model for classification. In MetaboAnalyst, the CV outputs include Q2 and accuracy; the permutation test reports an empirical p-value.
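To make the three numbers concrete, here is a minimal sketch in plain Python of how a cross-validated Q2, a cross-validated accuracy, and an empirical permutation p-value can be computed. It is not MetaboAnalyst's code: the one-component PLS fit, the leave-one-out scheme, the use of accuracy as the permutation statistic, the synthetic data, and all function names (`fit_pls1`, `predict`, `loo_cv`, `permutation_p`) are assumptions made for illustration.

```python
import random

def fit_pls1(X, y):
    """Fit a one-component PLS model with y coded 0/1 (illustrative only)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance with y, unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5 or 1.0
    w = [v / norm for v in w]
    # Scores of the training samples and regression of y on those scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t) or 1.0
    c = sum(t[i] * yc[i] for i in range(n)) / tt
    return x_mean, y_mean, w, c

def predict(model, x):
    """Predicted (continuous) class value for one new sample."""
    x_mean, y_mean, w, c = model
    return y_mean + c * sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))

def loo_cv(X, y):
    """Leave-one-out CV: each sample is predicted by a model that never saw it."""
    preds = []
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, X[i]))
    y_mean = sum(y) / len(y)
    press = sum((y[i] - preds[i]) ** 2 for i in range(len(y)))
    tss = sum((v - y_mean) ** 2 for v in y)
    q2 = 1 - press / tss
    acc = sum((preds[i] > 0.5) == (y[i] == 1) for i in range(len(y))) / len(y)
    return q2, acc

def permutation_p(X, y, n_perm=200, seed=0):
    """Empirical p-value: how often shuffled labels do at least as well."""
    rng = random.Random(seed)
    observed = loo_cv(X, y)[1]
    hits = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)
        if loo_cv(X, y_perm)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Synthetic data (assumption): 10 samples per class, 5 features,
# the first two features shifted in class 1.
rng = random.Random(42)
y = [0] * 10 + [1] * 10
X = [[rng.gauss(3.0 * label if j < 2 else 0.0, 1.0) for j in range(5)]
     for label in y]

q2, acc = loo_cv(X, y)
print("Q2:", q2, "accuracy:", acc)
print("permutation p-value:", permutation_p(X, y))
```

The point of the sketch is that the train/test splitting happens inside the CV loop: every prediction that feeds Q2 and accuracy comes from a model fitted without that sample.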

Scores and loadings are visualization techniques that help you understand the top components identified by the PLS-DA model. Although it is possible to compute scores and loadings within each CV fold, this is rarely useful.
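For illustration, scores and loadings can be obtained from a single model fitted to all samples, with no split involved. The sketch below computes them for the first component only; the function name `pls_scores_loadings` and the toy data are hypothetical, and this is a simplified one-component calculation rather than MetaboAnalyst's implementation.

```python
def pls_scores_loadings(X, y):
    """First-component PLS scores and loadings from a fit on ALL samples."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Unit-length weight vector pointing along the X-y covariance.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5 or 1.0
    w = [v / norm for v in w]
    # Scores: one value per sample (what a scores plot shows).
    scores = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in scores) or 1.0
    # Loadings: one value per feature (what a loadings plot shows).
    loadings = [sum(Xc[i][j] * scores[i] for i in range(n)) / tt
                for j in range(p)]
    return scores, loadings

# Toy data (assumption): 4 samples, 2 features, two classes.
X = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
y = [0, 0, 1, 1]
scores, loadings = pls_scores_loadings(X, y)
print("scores:", scores)
print("loadings:", loadings)
```

Because these values describe the fitted components rather than predictive performance, they are usually reported once for the full data set.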

Finally, classification requires a large number of samples. When you have very few samples, say fewer than 12, I would suggest focusing on simpler methods such as t-tests/ANOVA, PCA, and heatmaps.


Thanks so much for getting back to me and for your advice!
