Data from human reference studies were downloaded from the publicly accessible QIITA repository. In these datasets, taxa are labelled with Greengenes identifiers. The reference datasets are grouped by the body site sampled and by the sequencing platform used to generate them. For details about the reference studies and their methodologies, please refer to the paper by Lozupone CA et al.
Once users upload their human microbiome data, MicrobiomeAnalyst tries to merge it with a selected reference dataset based on common taxon names. At least 20 percent of OTUs must match between the user data and the reference data for the analysis to proceed. The merged data are then filtered and normalized with a common method, and PCoA is performed on the result using various distance measures.
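The merge step described above can be illustrated with a minimal Python sketch. It is not MicrobiomeAnalyst's actual implementation: the function names, the dictionary layout (taxon name mapped to per-sample counts), and the choice to measure the 20 percent match against the user's own taxa are all assumptions made for illustration. Total-sum scaling and Bray-Curtis distance stand in for the unspecified normalization method and for one of the "various distance measures"; the resulting distances are what a PCoA would take as input.

```python
MIN_MATCH_FRACTION = 0.20  # the 20 percent threshold stated in the text


def merge_on_common_taxa(user, reference, min_fraction=MIN_MATCH_FRACTION):
    """Merge two abundance tables on shared taxon names.

    Both tables are dicts of the form {taxon_name: {sample_name: count}}.
    Assumption: sample names do not collide between the two tables.
    """
    common = sorted(set(user) & set(reference))
    # Assumption: the match fraction is computed relative to the user's taxa.
    fraction = len(common) / len(user) if user else 0.0
    if fraction < min_fraction:
        raise ValueError(
            f"only {fraction:.0%} of taxa match; at least {min_fraction:.0%} required"
        )
    merged = {}
    for taxon in common:
        row = dict(reference[taxon])
        row.update(user[taxon])
        merged[taxon] = row
    return merged


def relative_abundance(merged):
    """Total-sum scaling: convert counts to per-sample proportions.

    Returns {sample_name: {taxon_name: proportion}}.
    """
    samples = sorted({s for row in merged.values() for s in row})
    totals = {s: sum(row.get(s, 0) for row in merged.values()) for s in samples}
    return {
        s: {
            t: (merged[t].get(s, 0) / totals[s] if totals[s] else 0.0)
            for t in merged
        }
        for s in samples
    }


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two profiles over the same taxa."""
    numerator = sum(abs(a[t] - b[t]) for t in a)
    denominator = sum(a[t] + b[t] for t in a)
    return numerator / denominator if denominator else 0.0
```

To use the sketch, call `merge_on_common_taxa(user, reference)`, pass the result to `relative_abundance`, and compute `bray_curtis` for every pair of samples to build the distance matrix. The ordination itself (the eigendecomposition behind PCoA) is left out here and would normally come from a numerical library.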