How can I detect and deal with outliers?

Outliers are samples that differ markedly from the “majority”. A potential outlier stands out as a sample located far from the main clusters or trends formed by the remaining data. To discover potential outliers, users can inspect a variety of summary plots of their data for warning signs such as:

  • A sample with extreme diversity (alpha or beta)
  • A sample with very low sequencing depth (visible in rarefaction curve analysis or the sample size viewer)
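
For readers who want to screen a count table outside the graphical tools, the two warning signs above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the tool itself uses: the function names, the `min_depth` cutoff of 1000 reads, the toy count table, and the choice of a median/MAD "modified z-score" rule with a threshold of 3.5 are all hypothetical choices made for this example. Only alpha diversity (Shannon index) is covered here.

```python
import math
from statistics import median


def shannon(counts):
    """Shannon diversity index (natural log) of one sample's taxon counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def mad_outliers(values, threshold=3.5):
    """Names whose modified z-score (median/MAD based) exceeds the threshold.

    The 3.5 threshold is an assumed rule of thumb, not a fixed standard.
    """
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        return []  # no spread around the median, nothing can be flagged
    return [
        name
        for name, v in values.items()
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


def flag_samples(table, min_depth=1000):
    """Flag samples with low sequencing depth or extreme alpha diversity.

    `table` maps sample name -> list of taxon counts.
    `min_depth` is a hypothetical cutoff; choose one that suits your data.
    """
    depth = {name: sum(counts) for name, counts in table.items()}
    diversity = {name: shannon(counts) for name, counts in table.items()}
    return {
        "low_depth": sorted(n for n, d in depth.items() if d < min_depth),
        "extreme_diversity": sorted(mad_outliers(diversity)),
    }


# Toy count table (made-up numbers, four taxa per sample).
table = {
    "S1": [400, 300, 200, 100],
    "S2": [250, 250, 250, 250],
    "S3": [600, 200, 100, 100],
    "S4": [500, 250, 150, 100],
    "S5": [4, 3, 2, 1],        # very few reads
    "S6": [997, 1, 1, 1],      # dominated by a single taxon
}

result = flag_samples(table)
print(result)
```

With this toy table, S5 is flagged for low depth and S6 for extreme diversity. A flag is only a prompt to look more closely at the sample; it does not by itself justify removing it.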

Outliers may arise for biological or technical reasons. To deal with outliers, the first step is to check whether those samples were measured properly. In many cases, outliers are the result of operational errors during the analytical process. If those values cannot be corrected (e.g., by normalization procedures), the samples should be removed from your analysis via Sample Editor.