The statistics tools don't take the output from the initial data QC steps

Hi Jeff

Thanks for the reminder, and apologies for not following the guide in the first place.

Bug report: the results of the initial data processing (filtering, normalisation, transformation) are not passed on to the follow-up Statistics steps

platform: the web-based MetaboAnalyst 6.0

Module: Statistical Analysis [one factor]

data type: the test data MS peak lists

data processing
all steps completed normally
data upload: normal
data filtering: finished normally, with a pop-up reporting 407 features, 37 and 41 filtered out, 331 remaining
- Low-variance filter 10%
- Low-abundance filter 10%
Normalization: normal
- Normalization by a pooled sample from group (group PQN) to WT
- Log transformation (base 2)
- no Data scaling
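
For readers unfamiliar with the two filters listed above, here is a minimal base-R sketch of a percentile-based low-variance and low-abundance filter. It is an illustration under assumed rules (drop the lowest 10% of features by SD, then the lowest 10% of the remainder by mean), not MetaboAnalyst's own implementation. The helper name `filter_features` and the simulated table are hypothetical, and the counts it produces need not match the pop-up.

```r
# Illustrative only: a plain base-R percentile filter, not MetaboAnalyst code.
# Rows are samples, columns are features.
filter_features <- function(x, sd_pct = 10, mean_pct = 10) {
  # Drop the sd_pct% of features with the lowest standard deviation.
  sds <- apply(x, 2, sd)
  x <- x[, sds > quantile(sds, sd_pct / 100), drop = FALSE]
  # Then drop the mean_pct% of the remaining features with the lowest mean.
  means <- colMeans(x)
  x[, means > quantile(means, mean_pct / 100), drop = FALSE]
}

# Simulated stand-in for a 407-feature peak table (assumed data).
set.seed(1)
raw <- matrix(rlnorm(20 * 407), nrow = 20,
              dimnames = list(NULL, paste0("f", 1:407)))
filtered <- filter_features(raw)

print(ncol(raw))       # number of features before filtering
print(ncol(filtered))  # number of features after filtering
```

Whatever the exact rules, the point of the report is that the smaller table is the one every later step should receive.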

Statistics
But when it came to the statistics steps such as the t-test, PCA and fold change analysis, the analysis was still run on all 407 original features, not the 331 filtered features. I am not sure whether the normalisation and scaling had worked, but the filtering obviously hadn’t.

Could you please look into this issue? Many thanks.

Cheers
Wenzhe

Please follow our posting guide.

Hi Jeff, thanks for the reminder. Please see the revised content in the original post. :pray:

Everything is working as expected in my testing. Did you open multiple tabs? Or did you click the “Proceed” button after data filtering? Some steps need to be performed sequentially, following the page.

Hi Jeff

Thank you so much for your prompt response!

I did only use one tab. Just in case, I also cleared the cache and cookies and then tried again on both the dev.metaboanalyst.ca and www.metaboanalyst.ca domains. Unfortunately, the issue persists. I’ve got some screenshots here.

At the data filtering step, I clicked “Submit” first and then “Proceed”. I got a clear notification that the filtering had completed successfully.

But in the t-test, for example, all the features were still analysed. The other statistics tools behave the same way.

You need to provide complete steps - and R history.

Sure, Jeff.

  1. Download the test data .csv here: https://api2.xialab.ca/api/download/metaboanalyst/lcms_table.csv

  2. Upload the csv with the options shown in the screenshot

  3. Data Integrity Check OK; click “Proceed”

  4. Data Filtering using 10% low-variance (SD) and low-abundance (mean) filters; submit. Got the filtering notice saying 331 features remaining

  5. Normalization by a pooled sample from group (group PQN) of WT, log2 transformation and Pareto scaling; click “Normalize”

  6. View the result; normalisation worked beautifully. Click “Proceed”

  7. Select T-test; all 407 original features remain.

The R script is also attached:

PID of current job: 3184526

mSet<-InitDataObjects("pktable", "stat", FALSE, 150)
mSet<-Read.TextData(mSet, "Replacing_with_your_file_path", "colu", "disc");
mSet<-SanityCheckData(mSet)
mSet<-ReplaceMin(mSet);
mSet<-CheckContainsBlank(mSet)
mSet<-RemoveMissingByPercent(mSet, percent=1.0, F)
mSet<-FilterVariable(mSet, "F", 20, "sd", 10, "mean", 10, F,10.0)
mSet<-PreparePrenormData(mSet)
mSet<-Normalization(mSet, "GroupPQN", "Log2Norm", "ParetoNorm", "WT", ratio=FALSE, ratioNum=20)
mSet<-PlotNormSummary(mSet, "norm_0_", "png", 150, width=NA)
mSet<-PlotSampleNormSummary(mSet, "snorm_0_", "png", 150, width=NA)
mSet<-Ttests.Anal(mSet, F, 0.05, FALSE, TRUE, "fdr", FALSE)
mSet<-PlotTT(mSet, "tt_0_", "png", 150, width=NA)

Also, I just tested this using MetaboAnalystR with the script above, and the output is the same. So the behaviour (ignoring the data filtering) does not seem to be specific to the web interface; it appears to come from how MetaboAnalyst itself handles the data.
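
A quick way to make this kind of mismatch visible in an R session is to compare the number of feature columns held at each stage. The sketch below uses stand-in matrices with the counts reported in this thread; `feature_counts` is a hypothetical helper, and in a real MetaboAnalystR session the tables would have to be pulled from the `mSet` object, whose internal slot names are not shown here and may differ between versions.

```r
# Hypothetical helper: report how many features each processing stage holds.
# `stages` is a named list of matrices or data frames, features in columns.
feature_counts <- function(stages) {
  vapply(stages, ncol, integer(1))
}

# Stand-in tables carrying the feature counts reported in this thread.
stages <- list(
  original = matrix(0, nrow = 4, ncol = 407),
  filtered = matrix(0, nrow = 4, ncol = 331),
  analysed = matrix(0, nrow = 4, ncol = 407)  # what the t-test received
)

counts <- feature_counts(stages)
print(counts)

# TRUE would mean the statistics step used the filtered table.
filter_propagated <- counts[["analysed"]] == counts[["filtered"]]
print(filter_propagated)
```

If the analysed count equals the original count rather than the filtered one, the filtering step was not carried forward.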

While this might not be a big issue for something like t-tests, it becomes quite problematic when it comes to PCA. Including features that were already filtered out really undermines the analysis.
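
To illustrate why this matters for PCA, here is a small simulated sketch, not taken from the test dataset. It assumes unit-variance scaling through `prcomp(scale. = TRUE)` (the run above used Pareto scaling, so the size of the effect will differ) and a simple 10% SD filter. All names and data are made up for the example.

```r
set.seed(42)
group <- rep(c(0, 1), each = 10)  # two groups of 10 samples

# 36 features carrying a group difference, 4 near-constant features.
informative   <- sapply(1:36, function(i) rnorm(20, mean = 3 * group))
near_constant <- sapply(1:4,  function(i) rnorm(20, sd = 0.01))
full <- cbind(informative, near_constant)

# Low-variance filter: drop the 10% of features with the smallest SD.
sds  <- apply(full, 2, sd)
kept <- full[, sds > quantile(sds, 0.10), drop = FALSE]

# Share of total variance explained by the first principal component.
pc1_share <- function(x) {
  p <- prcomp(x, center = TRUE, scale. = TRUE)
  p$sdev[1]^2 / sum(p$sdev^2)
}

share_full <- pc1_share(full)  # PCA on the unfiltered table
share_kept <- pc1_share(kept)  # PCA on the filtered table
print(c(full = share_full, kept = share_kept))
```

Once scaled, the near-constant features contribute as much total variance as any other feature, so leaving them in dilutes the component that separates the groups.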

I can see the issue with this new data. It is fixed for this scenario, pending a server update.

Note that the first dataset (MS peak lists) should work as expected (it is not affected by this).

Thank you so much, Jeff!

I realised that I put the wrong link in the first post. I meant to put the “MS peak” not the “MS peak list” there.

Would you be able to advise when the update will take effect?