What are the normalization options for proteomics and how should I choose?

Guangyan · March 20, 2026, 10:57pm

ProteoAnalyst provides two stages of normalization:

Sample Normalization (corrects for technical variation between samples, applied first):

None: No sample normalization
Sample-specific factors: Scale samples using manual factors or a metadata column (e.g., protein loading, sample weight)
Constant sum: Scale each sample so that its total abundance equals the same fixed value (1000)
Sample median: Normalize all samples to equal median abundance
Probabilistic Quotient Normalization (PQN): Corrects for dilution effects by computing per-protein quotients relative to a reference sample and dividing each sample by the median of its quotients
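
To make the scaling options above concrete, here is a minimal sketch in plain Python. It is not ProteoAnalyst's implementation. The function names are hypothetical, each sample is assumed to be a list of positive abundances in the same protein order, and the PQN reference is assumed to be the per-protein median across samples (the tool may build its reference differently).

```python
from statistics import median

def constant_sum(sample, target=1000.0):
    """Scale one sample so its abundances sum to `target`."""
    total = sum(sample)
    if total <= 0:
        raise ValueError("sample must have a positive total abundance")
    return [x * target / total for x in sample]

def median_scale(samples):
    """Scale every sample to a shared median (assumed: mean of the sample medians)."""
    medians = [median(s) for s in samples]
    target = sum(medians) / len(medians)
    return [[x * target / m for x in s] for s, m in zip(samples, medians)]

def pqn(samples):
    """Divide each sample by its median quotient against a reference profile."""
    n_proteins = len(samples[0])
    # Assumed reference: per-protein median across all samples.
    reference = [median(s[i] for s in samples) for i in range(n_proteins)]
    normalized = []
    for s in samples:
        quotients = [x / r for x, r in zip(s, reference) if r > 0]
        dilution = median(quotients)  # estimated dilution factor of this sample
        normalized.append([x / dilution for x in s])
    return normalized

# Two made-up samples; the second is a 2x more concentrated copy of the first.
samples = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
print([constant_sum(s) for s in samples])
print(median_scale(samples))
print(pqn(samples))
```

Because the second made-up sample is a pure 2x dilution difference, all three methods bring the two samples onto the same values. They start to disagree when a few proteins change strongly, which is the case PQN is designed to tolerate.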

Data Transformation (corrects for intensity distribution skewness):

None: No transformation
Log2 Transformation: Standard log2 transform of intensity values
Log2 Transformation + Median Centering: Log2 transform followed by centering each sample to a common median
Variance Stabilizing Normalization (VSN): Stabilizes variance across the intensity range using a generalized log transformation
Robust Linear Regression Normalization (RLR): Fits each sample against the global median profile using robust linear regression
Local Regression Normalization (LOESS): Uses local regression to correct intensity-dependent bias between samples
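
As an illustration of the Log2 Transformation + Median Centering option, a rough sketch follows. It is not the tool's own code. The function names are hypothetical, missing or non-positive intensities are assumed to be represented as None, and the common median is assumed to be the median of the per-sample medians.

```python
from math import log2
from statistics import median

def log2_transform(sample):
    """Log2-transform intensities; missing or non-positive values become None."""
    return [log2(x) if x is not None and x > 0 else None for x in sample]

def median_center(log_samples):
    """Shift each log2 sample so that all samples share one median."""
    medians = [median(v for v in s if v is not None) for s in log_samples]
    common = median(medians)  # assumed choice of the common median
    return [
        [v - m + common if v is not None else None for v in s]
        for s, m in zip(log_samples, medians)
    ]

# Made-up raw intensities for two samples; 0 stands in for a missing value.
raw = [[1.0, 2.0, 4.0, 0], [2.0, 4.0, 8.0, 16.0]]
logged = [log2_transform(s) for s in raw]
centered = median_center(logged)
print(centered)
```

On the log2 scale, a multiplicative difference between samples becomes an additive offset, which is why subtracting a per-sample median is enough to remove it.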

How to choose:

Use Sample Normalization when you have known technical differences between samples (e.g., different loading amounts). You can evaluate results using the diagnostic plots (box plots, PCA, MA plots) shown after normalization.

For most label-free proteomics experiments, Log2 Transformation + Median Centering is the recommended default. If you observe strong intensity-dependent bias (visible in MA plots as a trend away from the horizontal zero line), consider LOESS or RLR. VSN is a good alternative when variance changes with intensity, for example when low-abundance proteins remain much noisier than high-abundance ones after a log2 transform.
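
If you want a rough numeric companion to reading an MA plot by eye, the sketch below computes M (log ratio) and A (average log intensity) for one sample against a reference and fits a straight line through them. This is only an illustration under assumptions: the function names are hypothetical, inputs are assumed to be log2 intensities with no missing values, and the slope is a crude stand-in for visual inspection, not a ProteoAnalyst feature.

```python
def ma_values(log_sample, log_reference):
    """Return M (difference) and A (average) for paired log2 intensities."""
    m = [s - r for s, r in zip(log_sample, log_reference)]
    a = [(s + r) / 2 for s, r in zip(log_sample, log_reference)]
    return m, a

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Made-up log2 intensities: a reference profile and two samples.
reference = [10.0, 12.0, 14.0, 16.0]
offset_only = [r + 1.0 for r in reference]      # constant shift on the log scale
intensity_bias = [r * 1.2 for r in reference]   # shift grows with intensity

for name, sample in [("offset_only", offset_only), ("intensity_bias", intensity_bias)]:
    m, a = ma_values(sample, reference)
    print(name, "slope of M vs A:", ols_slope(a, m))
```

A slope near zero with a nonzero average M suggests a constant offset, which median centering removes. A clear slope points to intensity-dependent bias, where RLR (linear trend) or LOESS (curved trend) is the better fit.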