How are missing values imputed for EcoToxChip?

Imputation assumes that missing and high Ct values indicate very low or no expression. Because normalization and differential expression analysis algorithms do not perform well when there are missing entries in the data matrix, we replace these entries with values that represent very low expression. Giving all missing measurements the exact same value causes problems during differential expression analysis: identical values artificially reduce the variance within a group, potentially leading to artificially low p-values.

The first step is to decide the Ct cut-off for reliable measurements, as accuracy tends to decrease at high cycle numbers. Then, all missing values and all values above the cut-off are replaced by values drawn randomly from a normal distribution whose mean is the Ct cut-off and whose standard deviation is estimated from the original data surrounding this cut-off. For example, in the image above, the Ct cut-off was left at the default of 35, and the replacement values were drawn from the distribution of the data around this point.
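
The two steps above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function name `impute_ct`, the `window` parameter (how many cycles on either side of the cut-off count as "surrounding" data), the use of NaN to mark missing wells, and the pooling of the standard deviation across the whole matrix are all assumptions made for illustration.

```python
import numpy as np

def impute_ct(ct, cutoff=35.0, window=2.0, seed=None):
    """Replace missing (NaN) and above-cut-off Ct values with random draws.

    Hypothetical helper: `window` is an assumed definition of the data
    "surrounding" the cut-off, not a documented parameter of the tool.
    """
    ct = np.asarray(ct, dtype=float).copy()
    rng = np.random.default_rng(seed)

    # Standard deviation of the observed values lying near the cut-off.
    observed = ct[~np.isnan(ct)]
    near = observed[np.abs(observed - cutoff) <= window]
    if near.size < 2:
        raise ValueError("not enough values near the cut-off to estimate a standard deviation")
    sd = near.std(ddof=1)

    # Entries to replace: missing values and values above the cut-off.
    with np.errstate(invalid="ignore"):
        mask = np.isnan(ct) | (ct > cutoff)

    # Each replaced entry gets its own draw, so no two are forced to be equal.
    ct[mask] = rng.normal(loc=cutoff, scale=sd, size=int(mask.sum()))
    return ct

# Example: rows are genes, columns are samples; NaN marks a missing measurement.
data = np.array([
    [20.0, 34.5, np.nan],
    [36.0, 35.5, 33.8],
    [np.nan, 25.0, 40.0],
])
imputed = impute_ct(data, cutoff=35.0, window=2.0, seed=1)
```

Because the draws are centred on the cut-off, some replacement values fall slightly below it and some slightly above. Fixing `seed` makes the imputation reproducible between runs.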