Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Open Badges Research article

Impact of image segmentation on high-content screening data quality for SK-BR-3 cells

Andrew A Hill*, Peter LaPan, Yizheng Li and Steve Haney

Author Affiliations

Department of Biological Technologies, Wyeth Research, 35 CambridgePark Drive, Cambridge, MA 02140, USA

For all author emails, please log on.

BMC Bioinformatics 2007, 8:340  doi:10.1186/1471-2105-8-340

Published: 14 September 2007



High content screening (HCS) is a powerful method for the exploration of cellular signalling and morphology that is rapidly being adopted in cancer research. HCS uses automated microscopy to collect images of cultured cells. The images are subjected to segmentation algorithms to identify cellular structures and quantitate their morphology, for hundreds to millions of individual cells. However, image analysis may be imperfect, especially for "HCS-unfriendly" cell lines whose morphology is not well handled by current image segmentation algorithms. We asked if segmentation errors were common for a clinically relevant cell line, if such errors had measurable effects on the data, and if HCS data could be improved by automated identification of well-segmented cells.


Cases of poor cell body segmentation occurred frequently for the SK-BR-3 cell line. We trained classifiers to identify SK-BR-3 cells that were well segmented. On an independent test set created by human review of cell images, our optimal support-vector machine classifier identified well-segmented cells with 81% accuracy. The dose responses of morphological features were measurably different in well- and poorly-segmented populations. Elimination of the poorly-segmented cell population increased the purity of DNA content distributions, while appropriately retaining biological heterogeneity, and simultaneously increasing our ability to resolve specific morphological changes in perturbed cells.


Image segmentation has a measurable impact on HCS data. The application of a multivariate shape-based filter to identify well-segmented cells improved HCS data quality for an HCS-unfriendly cell line, and could be a valuable post-processing step for some HCS datasets.