Table 1. An overview of the proposed data analysis workflow.

Steps 
Detailed tasks 
Comments 


Statement of the problem 
• Specify comparisons of interest 
• Express comparisons as statistical hypotheses 
• Define scope of biological replication 
• Restricted scope suitable for screening; expanded scope required for validation 



Exploratory data analysis 
• Detect misidentified features 
• Remove obvious outliers 
• Detect features with missing values 
• Choose imputation strategy 



Model-based analysis
• Fit linear mixed model per protein 
• Reduced scope of biological replication = fixed subjects; expanded scope = random subjects 
• Check QQ-plots for Normality
• If deviations, conclusions are approximate only 

• Check residual plots for equal variance 
• If deviations, use iterative least squares 

• Test comparisons of interest 
• Adjust p-values per comparison to control FDR

• Quantify protein abundance in conditions or samples of interest 
• Use as input with downstream clustering or classification 



Design follow-up experiments
• Evaluate power and sample size 
• Find minimal sample size for a fold change 
• Find minimal fold change for a sample size 
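
The two power questions in the last row of the table can be illustrated with a small sketch. It is not the MSstats calculation: it assumes a simple two-group comparison of log2 abundances with a known common standard deviation and a Normal approximation, and it ignores multiple testing. The function names and the default values of `alpha` and `power` are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the two Normal quantiles used by the approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return z_alpha + z_beta


def min_samples_per_group(log2_fold_change, sd, alpha=0.05, power=0.8):
    """Smallest number of biological replicates per group that detects
    the given log2 fold change (assumed two-group Normal approximation)."""
    n = 2 * (sd * _z_sum(alpha, power) / log2_fold_change) ** 2
    return math.ceil(n)


def min_detectable_log2_fc(n_per_group, sd, alpha=0.05, power=0.8):
    """Smallest log2 fold change detectable with the given number of
    replicates per group, under the same assumptions."""
    return sd * _z_sum(alpha, power) * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    # A two-fold change (log2 fold change of 1) with sd = 1 on the log2 scale.
    print(min_samples_per_group(log2_fold_change=1.0, sd=1.0))
    print(min_detectable_log2_fc(n_per_group=16, sd=1.0))
```

The two functions invert the same relation, so fixing the fold change gives a sample size and fixing the sample size gives a fold change, which mirrors the two comments in the table.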



Supplementary Table 2 shows MSstats commands for each step. 
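
Independently of MSstats, the FDR adjustment listed under model-based analysis can be sketched in a few lines. The example below assumes the Benjamini-Hochberg procedure, which is one common way to control FDR; the table itself does not name the procedure, and the function name and the input p-values are hypothetical. The adjustment is applied to the p-values of all proteins within one comparison.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values for four proteins in a single comparison.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f}  adjusted = {q:.3f}")
```

Proteins whose adjusted p-value falls below the chosen FDR threshold would be reported as differentially abundant for that comparison.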

Clough et al. BMC Bioinformatics 2012 13(Suppl 16):S6 doi:10.1186/1471-2105-13-S16-S6