Removing poor-quality arrays improves prediction. Plot of the Matthews Correlation Coefficient for prediction of a clinical parameter, pathologic complete response, versus the number of lowest-quality arrays removed, shown for each quality metric. The prediction algorithm used was PAM. Prediction improved when the arrays with the poorest quality were removed; however, some metrics did substantially better than others at detecting arrays that negatively affect prediction. RLE and Percent Present appeared to perform best, followed by NUSE and GNUSE. Average background showed no improvement when fewer than 30 arrays were removed.
McCall et al. BMC Bioinformatics 2011 12:137 doi:10.1186/1471-2105-12-137
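
For readers unfamiliar with the y-axis quantity, the following is a minimal sketch of how a Matthews Correlation Coefficient could be computed after dropping the lowest-quality arrays. It is an illustration only, not the authors' pipeline: the function names, the 0/1 label coding, and the convention that a larger quality score means a worse array are all assumptions made here, and the sketch rescores a fixed set of predictions rather than refitting PAM on the retained arrays.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def mcc_after_removal(quality, y_true, y_pred, n_remove):
    """Drop the n_remove arrays with the worst quality score, then compute MCC.

    Assumption: a larger quality score means a worse array. For a metric
    where low values are bad (e.g. a percentage), negate the scores first.
    """
    order = sorted(range(len(quality)), key=lambda i: quality[i])
    keep = order[: len(order) - n_remove]
    return matthews_corrcoef([y_true[i] for i in keep],
                             [y_pred[i] for i in keep])


# Hypothetical toy data: four arrays, one of them (index 2) of poor quality.
quality = [0.1, 0.2, 0.9, 0.3]
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 0]
before = mcc_after_removal(quality, y_true, y_pred, 0)
after = mcc_after_removal(quality, y_true, y_pred, 1)
```

In this toy data the single misclassified sample is also the one with the worst quality score, so removing it raises the coefficient; real data would not generally behave this cleanly.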