Open Access Research article

A novel application of quantile regression for identification of biomarkers exemplified by equine cartilage microarray data

Liping Huang1*, Wenying Zhu2, Christopher P Saunders3, James N MacLeod2, Mai Zhou1, Arnold J Stromberg1 and Arne C Bathke1

Author Affiliations

1 Department of Statistics, 815 Patterson Office Tower, University of Kentucky, Lexington, Kentucky, 40508-0027, USA

2 Department of Veterinary Science, Gluck Equine Research Center, Lexington, KY, 40546-0099, USA

3 Document Forensics Laboratory, Department of Applied Information Technology, George Mason University, Fairfax, VA 22030, USA


BMC Bioinformatics 2008, 9:300  doi:10.1186/1471-2105-9-300

Published: 2 July 2008



Identification of biomarkers among thousands of genes arrayed for disease classification has been the subject of considerable research in recent years. These studies have focused on disease classification, comparing experimental groups of affected patients with normal patients. Related experiments can be done to identify tissue-restricted biomarkers, that is, genes with a high level of expression in one tissue compared to other tissue types in the body.


In this study, cartilage was compared with ten other body tissues using a two-color array experimental design. We determined cartilage-specific Z-scores based on the observed log ratio M and classified genes with Z-scores ≥ 1.96 in all ten cartilage/tissue comparisons as cartilage-specific genes. Thirty-seven probe sets were identified as cartilage biomarkers. Of these, 13 (35%) have existing annotation associated with cartilage, including several well-established cartilage biomarkers. These genes comprise a useful database from which novel targets for cartilage biology research can be selected.
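The selection rule stated above (a Z-score of at least 1.96 in every one of the ten cartilage/tissue comparisons) can be sketched in Python as follows. This is only an illustration of the rule: the function names, the probe identifiers and the Z-score values are hypothetical, and how the Z-scores themselves are derived from M is not shown here.

```python
Z_CUTOFF = 1.96        # threshold quoted in the text
N_COMPARISONS = 10     # cartilage versus each of the ten other tissues


def is_cartilage_specific(z_scores, cutoff=Z_CUTOFF, n_comparisons=N_COMPARISONS):
    """Return True if the Z-score meets the cutoff in every comparison."""
    if len(z_scores) != n_comparisons:
        raise ValueError("expected one Z-score per cartilage/tissue comparison")
    return all(z >= cutoff for z in z_scores)


def select_biomarkers(z_by_probe):
    """Return the sorted probe IDs that pass the rule in all comparisons."""
    return sorted(
        probe for probe, z_scores in z_by_probe.items()
        if is_cartilage_specific(z_scores)
    )


# Hypothetical toy data: probe IDs and Z-scores are invented for illustration.
toy_z_scores = {
    "probe_A": [2.5, 3.1, 2.0, 4.2, 2.2, 2.8, 3.3, 2.1, 2.6, 3.0],
    "probe_B": [2.5, 3.1, 1.5, 4.2, 2.2, 2.8, 3.3, 2.1, 2.6, 3.0],  # fails one comparison
    "probe_C": [1.96] * 10,  # exactly at the cutoff, so it passes
}

selected = select_biomarkers(toy_z_scores)
```

Requiring the cutoff to hold in all ten comparisons, rather than on average, means a single tissue with comparable expression is enough to exclude a probe.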


Quantile regression is a promising method for the analysis of two-color array experiments that compare multiple samples in the absence of biological replicates, a setting in which the quantifiable error is limited. We used a nonparametric approach to reveal the relationship between percentiles of M and A, where M = log2(R/G) and A = 0.5 log2(RG), with R representing the gene expression level in cartilage and G representing the gene expression level in one of the other ten tissues. We then performed linear quantile regression to identify genes with a cartilage-restricted pattern of expression.
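The quantities defined above can be sketched in Python as follows. The first function computes M and A from a pair of intensities; the second fits a straight line for a chosen quantile of M given A. The function names, the toy values and the brute-force search over pairs of points are assumptions made for illustration and are not taken from the study, which would in practice rely on a dedicated quantile regression routine.

```python
import math
from itertools import combinations


def ma_values(r, g):
    """Return (M, A) for positive intensities r (cartilage) and g (other tissue)."""
    m = math.log2(r / g)          # M = log2(R/G)
    a = 0.5 * math.log2(r * g)    # A = 0.5 * log2(R*G)
    return m, a


def check_loss(residual, tau):
    """Quantile ("check") loss of a single residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fit_linear_quantile(a_vals, m_vals, tau):
    """Fit M = b0 + b1 * A at quantile tau by exhaustive search.

    With one predictor, some minimiser of the total check loss passes
    through two data points with distinct A, so trying every such pair
    finds an optimum. Cost grows as n^3, so this suits small examples only.
    """
    points = list(zip(a_vals, m_vals))
    best = None
    for (a1, m1), (a2, m2) in combinations(points, 2):
        if a1 == a2:
            continue  # a vertical line is not a valid regression line
        b1 = (m2 - m1) / (a2 - a1)
        b0 = m1 - b1 * a1
        loss = sum(check_loss(m - (b0 + b1 * a), tau) for a, m in points)
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    if best is None:
        raise ValueError("need at least two points with distinct A values")
    return best[1], best[2]


# Hypothetical toy values, chosen so the arithmetic is easy to follow.
m_example, a_example = ma_values(8.0, 2.0)   # M = log2(4) = 2, A = 0.5 * log2(16) = 2
b0_median, b1_median = fit_linear_quantile(
    [0, 1, 2, 3, 4], [0, 1, 2, 3, 10], tau=0.5
)
```

In the toy fit, the median line follows the four points lying on M = A and is not pulled towards the outlying fifth point, which is the robustness property that makes quantile fits useful when a few genes are strongly tissue-restricted.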