Hi all,
I have really enjoyed using MicrobiomeAnalyst; it has been by far the easiest and most useful bioinformatics tool I have used so far. I have a few questions about rarefying/normalizing data inside MicrobiomeAnalyst versus outside of it.
I have Illumina 250 bp paired-end 16S sequencing data from three sample types within an urban river (crayfish, sediment, and water), and read counts differ greatly between samples (~300 to ~160,000 reads). Given that difference, I had planned to rarefy the data in MicrobiomeAnalyst and then continue the analysis. However, I am worried that rarefying to the minimum library size will discard too much data from the high-read samples, while removing the low-read samples instead will leave too few samples to answer our original research question.
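To make the trade-off concrete, here is a rough Python sketch of the arithmetic. The library sizes are made-up numbers within the range I described, not my real data, and `rarefaction_loss` is just a helper name I chose for illustration. It assumes that rarefying keeps exactly `depth` reads from every sample at or above that depth and drops any sample below it.

```python
def rarefaction_loss(library_sizes, depth=None):
    """Return (samples_kept, fraction_of_reads_discarded) when rarefying to `depth`.

    Samples with fewer than `depth` reads are dropped entirely; every
    remaining sample is subsampled down to exactly `depth` reads.
    """
    if depth is None:
        # Default: rarefy to the smallest library, so no sample is dropped.
        depth = min(library_sizes)
    kept = [n for n in library_sizes if n >= depth]
    total_reads = sum(library_sizes)
    retained_reads = depth * len(kept)
    return len(kept), 1 - retained_reads / total_reads


# Hypothetical library sizes (illustrative only, not real data).
sizes = [300, 5000, 40000, 160000]

for depth in (None, 5000):
    kept, lost = rarefaction_loss(sizes, depth)
    label = "minimum" if depth is None else depth
    print(f"depth={label}: {kept} samples kept, {lost:.1%} of reads discarded")
```

With numbers like these, either choice throws away most of the reads: a low depth keeps every sample but very few reads per sample, and a higher depth loses the shallow samples altogether.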
While reading the literature and the phyloseq documentation on normalizing microbiome data with uneven sequencing depth, I found that rarefying is not recommended because of the high chance of type II errors. McMurdie and Holmes (2014) recommend normalizing the data with the DESeq2, edgeR, or metagenomeSeq packages instead.
Can data normalized in any of those packages be used as input to MicrobiomeAnalyst, or is there another recommended way to deal with uneven sequencing depth in a data set without filtering out all of the low-read samples?
Thanks!
-Grant