Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) is an evolutionary modeling algorithm designed to estimate the gene families contributed to a metagenome by bacteria or archaea identified using 16S rRNA gene sequencing. The underlying assumption is that phylogenetically related organisms tend to have similar functional capabilities.
PICRUSt consists of two key steps:
- Gene content inference: PICRUSt estimates the gene content of microorganisms for which no genome sequence is available, using their sequenced relatives as a reference. Specifically, it infers the properties of ancestral organisms through ancestral state reconstruction (ASR).
- Metagenome inference: Because the number of copies of each gene family per organism has already been estimated and stored in the pre-calculated files, a metagenome prediction is produced by multiplying the vector of gene counts for each OTU by the abundance of that OTU in each sample, then summing across all OTUs.
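
The metagenome inference step amounts to a weighted sum, which the following minimal sketch illustrates. It is not PICRUSt's own code: the function name `predict_metagenome`, the dictionary layout, and the OTU, sample, and gene family identifiers are all hypothetical, chosen only to make the arithmetic concrete.

```python
def predict_metagenome(otu_abundance, gene_counts):
    """Illustrative only; not part of PICRUSt.

    otu_abundance: {sample: {otu: abundance}}
    gene_counts:   {otu: {gene_family: estimated copies per organism}}
    Returns {sample: {gene_family: predicted count}}.
    """
    prediction = {}
    for sample, abundances in otu_abundance.items():
        totals = {}
        for otu, abundance in abundances.items():
            # Multiply each OTU's gene-count vector by its abundance in
            # this sample, and accumulate the result across all OTUs.
            for family, copies in gene_counts[otu].items():
                totals[family] = totals.get(family, 0.0) + copies * abundance
        prediction[sample] = totals
    return prediction


# Hypothetical pre-calculated gene counts per OTU.
gene_counts = {
    "OTU_1": {"K00001": 2, "K00002": 0},
    "OTU_2": {"K00001": 1, "K00002": 3},
}

# Hypothetical OTU abundances per sample.
otu_abundance = {
    "sample_A": {"OTU_1": 10, "OTU_2": 5},
    "sample_B": {"OTU_1": 0, "OTU_2": 4},
}

predicted = predict_metagenome(otu_abundance, gene_counts)
print(predicted["sample_A"])  # {'K00001': 25.0, 'K00002': 15.0}
```

For `sample_A`, gene family `K00001` receives 2 × 10 from `OTU_1` plus 1 × 5 from `OTU_2`, giving 25. Written with matrices, the same computation is the product of a samples-by-OTUs abundance table and an OTUs-by-gene-families count table.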
For more details about the methodology, please visit the PICRUSt page.