Table 1

An overview of the proposed data analysis workflow.

Steps | Detailed tasks | Comments
Statement of the problem | • Specify comparisons of interest |
 | • Express comparisons as statistical hypotheses |
 | • Define scope of biological replication | • Restricted scope suitable for screening; expanded scope required for validation
Exploratory data analysis | • Detect mis-identified features | • Remove obvious outliers
 | • Detect features with missing values | • Choose imputation strategy
Model-based analysis | • Fit linear mixed model per protein | • Reduced scope of biological replication = fixed subjects; expanded scope = random subjects
 | • Check qq-plots for Normality | • If deviations, conclusions are approximate only
 | • Check residual plots for equal variance | • If deviations, use iterative least squares
 | • Test comparisons of interest | • Adjust p-values per comparison to control FDR
 | • Quantify protein abundance in conditions or samples of interest | • Use as input with downstream clustering or classification
Design follow-up experiments | • Evaluate power and sample size | • Find minimal sample size for a fold change
 | | • Find minimal fold change for a sample size

Supplementary Table 2 shows MSstats commands for each step.
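
To illustrate the "Adjust p-values per comparison to control FDR" task, here is a minimal Python sketch. It assumes the Benjamini-Hochberg procedure, which the table does not name, and it is not the MSstats implementation. The function name `bh_adjust` and the p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, ordered from largest to smallest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, i in enumerate(order):
        rank = m - offset  # rank of this p-value in ascending order
        # Scale by m / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values: one comparison tested across five proteins.
pvals = [0.001, 0.04, 0.03, 0.2, 0.8]
adj = bh_adjust(pvals)

# Proteins whose adjusted p-value falls below an assumed FDR cutoff of 0.05.
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

The adjustment is applied once per comparison, across all proteins tested for that comparison, so each comparison gets its own list of p-values.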

Clough et al. BMC Bioinformatics 2012 13(Suppl 16):S6   doi:10.1186/1471-2105-13-S16-S6
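
The two sample size tasks in the last row of Table 1 are inverses of each other. The sketch below uses the textbook normal approximation for a two-group comparison, n = 2σ²(z₁₋α/₂ + z_power)² / Δ², where Δ is the log2 fold change. This is a simplification chosen for illustration and is not the calculation MSstats performs. The function names, the standard deviation `sigma` (taken to be on the log2 intensity scale), and the default `alpha` and `power` are all assumptions.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the normal quantiles for a two-sided test and the target power."""
    std_normal = NormalDist()
    return std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)


def min_samples_per_group(fold_change, sigma, alpha=0.05, power=0.8):
    """Smallest number of replicates per group that detects fold_change (must not be 1)."""
    delta = math.log2(fold_change)  # effect size on the log2 scale
    return math.ceil(2 * (sigma * _z_sum(alpha, power) / delta) ** 2)


def min_fold_change(n_per_group, sigma, alpha=0.05, power=0.8):
    """Smallest fold change detectable with n_per_group replicates per group."""
    delta = sigma * _z_sum(alpha, power) * math.sqrt(2 / n_per_group)
    return 2 ** delta


# Hypothetical inputs: a 2-fold change and a log2-scale standard deviation of 0.5.
n = min_samples_per_group(2.0, sigma=0.5)
fc = min_fold_change(n, sigma=0.5)
```

Because the sample size is rounded up to a whole number, the fold change recovered from `n` is slightly smaller than the one requested.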
