How to choose covariates to include in the linear model?

The number of covariates that you can include depends in part on your sample size. We recommend adjusting for only one covariate if your sample size is less than 30, although the right number also depends on how much statistical power you are comfortable with. See this link for a rough picture of the relationship between the number of predictors, statistical power, and the sample size.
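
To make the sample-size trade-off concrete, here is a minimal sketch of how each added covariate reduces the residual degrees of freedom of a linear model. It assumes an intercept, one single-coefficient primary variable, and covariates that each cost one coefficient (a categorical covariate with several levels would cost more). The function name `residual_df` is hypothetical and is not part of the tool.

```python
def residual_df(n_samples, n_covariates, n_primary_terms=1):
    """Residual degrees of freedom for a linear model with an intercept.

    Assumption: each covariate and each primary term uses exactly one
    coefficient (true for numeric or two-level variables).
    """
    n_coefficients = 1 + n_primary_terms + n_covariates  # 1 for the intercept
    return n_samples - n_coefficients


# With 30 samples, every extra covariate removes one residual degree of freedom.
for k in range(4):
    print(f"{k} covariate(s): {residual_df(30, k)} residual df")
```

Fewer residual degrees of freedom generally mean noisier variance estimates and lower power, which is why small studies can afford few covariates.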

Once you have determined an appropriate number of covariates to include, it is prudent to prioritize those that appear to explain some variation in the metabolomics data but are not correlated with the primary metadata of interest. This can be assessed using the “Correlation” and “iPCA” tools within the multi-factor meta-data module. It is also important to leave out any variables that are highly correlated with the primary variable of interest, as including them can lead to highly unstable coefficient estimates (a problem known as multicollinearity).
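
As a rough illustration of the screening step, the sketch below computes the Pearson correlation between each candidate covariate and the primary variable and sets aside candidates whose absolute correlation exceeds a cutoff. It is not how the “Correlation” tool is implemented. The function names, the toy data, and the 0.7 cutoff are all assumptions chosen for the example, and the variables are assumed to be numerically coded.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_covariates(primary, candidates, max_abs_r=0.7):
    """Split candidates into kept/dropped by correlation with the primary variable.

    max_abs_r is an illustrative cutoff, not a recommended value.
    """
    kept, dropped = {}, {}
    for name, values in candidates.items():
        r = pearson(primary, values)
        if abs(r) > max_abs_r:
            dropped[name] = r  # too collinear with the primary variable
        else:
            kept[name] = r
    return kept, dropped


# Toy data (made up): primary variable coded 0 = control, 1 = case.
primary = [0, 0, 0, 0, 1, 1, 1, 1]
candidates = {
    "age": [34, 51, 46, 29, 38, 55, 41, 30],
    "batch": [1, 1, 1, 2, 2, 2, 2, 2],  # nearly confounded with the groups
}

kept, dropped = screen_covariates(primary, candidates)
print("keep:", kept)
print("drop:", dropped)
```

In this toy data, `batch` tracks the group labels closely and is set aside, while `age` is nearly uncorrelated with the primary variable and remains a candidate. Whether a kept covariate is worth including still depends on whether it explains variation in the metabolomics data.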