This article is part of the supplement: Proceedings of the 2011 International Conference on Bioinformatics and Computational Biology (BIOCOMP'11)

Open Access Research

Genome-wide prediction and analysis of human tissue-selective genes using microarray expression data

Shaolei Teng1, Jack Y Yang2 and Liangjiang Wang1,3*

Author Affiliations

1 Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA

2 Harvard Medical School, Harvard University, P.O. Box 400888, Cambridge, MA 02115, USA

3 J.C. Self Research Institute of Human Genetics, Greenwood Genetic Center, Greenwood, SC 29646, USA

BMC Medical Genomics 2013, 6(Suppl 1):S10 doi:10.1186/1755-8794-6-S1-S10

Published: 23 January 2013

Abstract

Background

Understanding how genes are expressed specifically in particular tissues is a fundamental question in developmental biology. Many tissue-specific genes are involved in the pathogenesis of complex human diseases. However, experimental identification of tissue-specific genes is time-consuming and difficult. Accurate prediction of tissue-specific gene targets could provide useful information for biomarker development and drug target identification.

Results

In this study, we developed a machine learning approach for predicting human tissue-specific genes using microarray expression data. Lists of known tissue-specific genes for different tissues were collected from the UniProt database, and the expression data for the listed genes were retrieved from a previously compiled dataset and used for input vector encoding. Random Forests (RFs) and Support Vector Machines (SVMs) were used to construct accurate classifiers. The RF classifiers were found to outperform the SVM models for tissue-specific gene prediction. The results suggest that the candidate genes predicted for brain- or liver-specific expression can provide valuable information for further experimental studies. Our approach was also applied to identify tissue-selective gene targets for different types of tissues.
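
The input vector encoding step described above might look roughly like the following minimal sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function name `encode_training_set`, the gene identifiers, and the expression values are hypothetical, and labeling every gene outside the curated list as a negative example is an assumption made here for simplicity.

```python
from typing import Dict, List, Sequence, Set, Tuple


def encode_training_set(
    expression: Dict[str, Sequence[float]],
    known_specific: Set[str],
) -> Tuple[List[str], List[List[float]], List[int]]:
    """Build (gene ids, input vectors, labels) from an expression table.

    Each gene's expression profile across samples becomes one input vector.
    Label 1 marks genes on the curated tissue-specific list; label 0 marks
    all other genes (an assumption of this sketch). Listed genes that have
    no expression data are silently skipped.
    """
    genes = sorted(expression)  # fixed order keeps vectors and labels aligned
    vectors = [[float(v) for v in expression[g]] for g in genes]
    labels = [1 if g in known_specific else 0 for g in genes]
    return genes, vectors, labels


# Toy, made-up expression values across three hypothetical tissues
# (brain, liver, heart); these are not data from the study.
toy_expression = {
    "GENE_A": [9.1, 2.0, 2.2],
    "GENE_B": [3.0, 3.1, 2.9],
    "GENE_C": [8.7, 1.5, 1.9],
}
# Hypothetical curated list of brain-specific genes; GENE_MISSING has no
# expression data and therefore does not appear in the encoded set.
toy_brain_list = {"GENE_A", "GENE_C", "GENE_MISSING"}

genes, X, y = encode_training_set(toy_expression, toy_brain_list)
```

The resulting `X` and `y` could then be passed to any standard classifier implementation, for example scikit-learn's `RandomForestClassifier` or `SVC`, to compare RF and SVM models by cross-validation.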

Conclusions

A machine learning approach has been developed for accurately identifying candidate genes for tissue-specific/selective expression. The approach provides an efficient way to select genes of interest for developing new biomedical markers and to improve our knowledge of tissue-specific expression.