How the Data Integrity Check processes input data

I have a data set of 202,000 sequence reads. When I upload the data and run the Data Integrity Check, the data set is reduced to about half (about 101,000) because:

  • Features with identical values (i.e. zeros) across all samples will be excluded;
  • Features that appear in only one sample will be excluded (considered artifacts)

However, the excluded features are important for my treatments and I do not want to exclude them. How can I solve this problem?



Thanks for posting the question. Our website is designed for efficient statistical analysis of microbiome data. In most cases the filtering step is necessary to obtain meaningful statistical results, and it also keeps the analysis computationally manageable. If you want all features to be processed, you can run the analysis locally using R packages such as phyloseq. Our new package, MicrobiomeAnalystR, will be released soon and will also be a good choice.
If you have further questions or suggestions, feel free to contact us.
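
Before moving to a local workflow, it can help to see which features the two rules quoted in the question would drop, and why. The following is only a minimal sketch in plain Python of those two rules as described above; it is not the website's actual implementation. The function name `integrity_filter`, the feature names, and the toy counts are all hypothetical.

```python
def integrity_filter(counts):
    """Split features into kept and dropped, following the two quoted rules.

    counts: dict mapping a feature name to its list of per-sample counts.
    Returns (kept, dropped), where dropped maps feature name -> reason.
    """
    kept, dropped = {}, {}
    for feature, values in counts.items():
        if len(set(values)) <= 1:
            # Rule 1: the same value (e.g. all zeros) in every sample
            dropped[feature] = "identical values across all samples"
        elif sum(1 for v in values if v > 0) == 1:
            # Rule 2: non-zero in exactly one sample
            dropped[feature] = "present in only one sample"
        else:
            kept[feature] = values
    return kept, dropped


# Hypothetical toy table: 4 features measured in 4 samples
table = {
    "OTU_1": [0, 0, 0, 0],
    "OTU_2": [0, 7, 0, 0],
    "OTU_3": [3, 5, 0, 2],
    "OTU_4": [4, 4, 4, 4],
}

kept, dropped = integrity_filter(table)
for feature, reason in dropped.items():
    print(f"{feature}: {reason}")
print("kept:", sorted(kept))
```

Listing the dropped features together with the reason makes it easier to decide which ones you want to retain when you run the analysis locally without the filter.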

Hope this helps!
