I have a few questions about running LEfSe analysis on datasets that have more than two classes/groups.
Does MicrobiomeAnalyst offer a strict version of LEfSe?
I noticed in my analyses that some features were identified as biomarkers of one class even though they did not differ significantly from all of the other groups, only from one or a few. According to the original LEfSe paper (Metagenomic biomarker discovery and explanation - PMC (nih.gov)), this corresponds to the non-strict version of LEfSe, i.e., it identifies biomarkers that distinguish at least one individual class. How can I run the strict strategy of LEfSe in MicrobiomeAnalyst, i.e., identify bacterial features that differ significantly from all other classes in a multi-class dataset, and obtain a detailed report of the p-values, FDR-corrected p-values, LDA scores, etc.?
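To make the distinction concrete, here is a minimal Python sketch of the two decision rules as I understand them. The function names, the 0.05 cutoff, and the pairwise p-values are made up for illustration; this is not the actual code of LEfSe or MicrobiomeAnalyst.

```python
ALPHA = 0.05  # assumed significance cutoff, for illustration only


def distinct_from_all_others(pairwise_p, classes, alpha=ALPHA):
    """Classes that differ significantly from EVERY other class (the rule I am asking for)."""
    return [
        c for c in classes
        if all(pairwise_p[frozenset((c, o))] < alpha for o in classes if o != c)
    ]


def distinct_from_at_least_one(pairwise_p, classes, alpha=ALPHA):
    """Classes that differ significantly from AT LEAST ONE other class (what I seem to get)."""
    return [
        c for c in classes
        if any(pairwise_p[frozenset((c, o))] < alpha for o in classes if o != c)
    ]


# Hypothetical pairwise Wilcoxon p-values for a single feature in a 3-class dataset.
classes = ["A", "B", "C"]
pairwise_p = {
    frozenset(("A", "B")): 0.01,
    frozenset(("A", "C")): 0.20,
    frozenset(("B", "C")): 0.03,
}

strict_hits = distinct_from_all_others(pairwise_p, classes)
loose_hits = distinct_from_at_least_one(pairwise_p, classes)
print("differs from all others:", strict_hits)
print("differs from at least one:", loose_hits)
```

In this made-up example only class B passes the first rule, while all three classes pass the second one, which is the kind of result I would like to avoid.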
MicrobiomeAnalyst offers the Benjamini-Hochberg method for correcting p-values for multiple testing; however, the LEfSe algorithm is sequential, meaning that it runs several tests in which each test depends on the results of the previous one. Are FDR-corrected p-values used in the Kruskal-Wallis test, the pairwise Wilcoxon tests, and the final LDA step, or only in the LDA step?
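For reference, this is the Benjamini-Hochberg adjustment I am referring to, as a minimal pure-Python sketch. The function name and the example p-values are hypothetical, and it says nothing about which LEfSe stage MicrobiomeAnalyst actually applies the correction to; that is exactly what I am asking.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all equal or larger ranks (keeps monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, e.g. one Kruskal-Wallis p-value per feature.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"raw p = {p:.3f}  ->  BH-adjusted = {q:.3f}")
```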
Sometimes the names of the bacteria in the bar and dot plots of the LEfSe results are cut off; other times the heatmap to the right of the dot plot is cropped, or, for datasets with many classes, the bar plot is limited to only a few distinct colors. Is there a way to adjust the plot parameters (e.g., plot width, margins, number of colors) for the visualization of LEfSe results?
I would appreciate all the information on the topic that you can provide!