This article is part of the supplement: Proceedings of the 2011 International Conference on Bioinformatics and Computational Biology (BIOCOMP'11)
Genome-wide prediction and analysis of human tissue-selective genes using microarray expression data
1 Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
2 Harvard Medical School, Harvard University, P.O. Box 400888, Cambridge, MA 02115, USA
3 J.C. Self Research Institute of Human Genetics, Greenwood Genetic Center, Greenwood, SC 29646, USA
BMC Medical Genomics 2013, 6(Suppl 1):S10 doi:10.1186/1755-8794-6-S1-S10
Published: 23 January 2013
How genes come to be expressed specifically in particular tissues is a fundamental question in developmental biology. Many tissue-specific genes are involved in the pathogenesis of complex human diseases. However, experimental identification of tissue-specific genes is time-consuming and difficult. Accurate prediction of tissue-specific gene targets could provide useful information for biomarker development and drug target identification.
In this study, we developed a machine learning approach for predicting human tissue-specific genes from microarray expression data. Lists of known tissue-specific genes for different tissues were collected from the UniProt database, and the expression data for the listed genes were retrieved from a previously compiled dataset and used for input vector encoding. Random Forests (RFs) and Support Vector Machines (SVMs) were used to construct accurate classifiers. The RF classifiers were found to outperform the SVM models for tissue-specific gene prediction. The results suggest that the predicted candidate genes for brain- or liver-specific expression can provide valuable information for further experimental studies. Our approach was also applied to identify tissue-selective gene targets for different types of tissues.
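The abstract does not spell out how the input vectors were encoded, so the following is only a minimal sketch under stated assumptions: each gene is represented by its expression values across tissues, and a gene is labeled 1 if it appears in the known tissue-specific list and 0 otherwise. The gene names, tissue order, expression values, and the `encode` helper are all hypothetical and are not taken from the study.

```python
# Hypothetical expression profiles: gene -> expression values, one per tissue.
# Gene names, tissue order, and values are invented for illustration only.
TISSUES = ["brain", "liver", "heart", "kidney"]

expression = {
    "GENE_A": [9.1, 2.0, 2.3, 1.8],
    "GENE_B": [1.5, 8.7, 2.1, 2.4],
    "GENE_C": [4.0, 4.2, 3.9, 4.1],
    "GENE_D": [8.5, 1.9, 2.2, 2.0],
}

# Hypothetical list of known brain-specific genes (for example, from a
# curated database). GENE_X has no expression profile and is simply skipped.
known_brain_specific = {"GENE_A", "GENE_D", "GENE_X"}


def encode(expression, positives):
    """Build parallel lists of gene names, feature vectors, and 0/1 labels.

    Assumption: genes absent from the known list are treated as negatives.
    """
    genes, features, labels = [], [], []
    for gene in sorted(expression):
        genes.append(gene)
        features.append(list(expression[gene]))
        labels.append(1 if gene in positives else 0)
    return genes, features, labels


genes, X, y = encode(expression, known_brain_specific)
for gene, vector, label in zip(genes, X, y):
    print(gene, vector, label)
```

Feature vectors and labels in this shape could then be passed to any standard RF or SVM implementation for training and cross-validation.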
A machine learning approach has been developed for accurately identifying candidate genes with tissue-specific or tissue-selective expression. The approach provides an efficient way to select promising genes for developing new biomedical markers and to improve our knowledge of tissue-specific expression.