This article is part of the supplement: Selected Proceedings of the 2010 AMIA Summit on Translational Bioinformatics
Assessing the quality of annotations in asthma gene expression experiments
1 Decision Systems Group, Brigham & Women’s Hospital, Harvard Medical School, Boston, MA, USA
2 Ohio State University, Columbus, OH, USA
3 University of Arizona, Tucson, AZ, USA
4 Division of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
BMC Bioinformatics 2010, 11(Suppl 9):S8 doi:10.1186/1471-2105-11-S9-S8
Published: 28 October 2010
The amount of data deposited in the Gene Expression Omnibus (GEO) has expanded significantly. For these data to be useful in future analyses, they must be properly annotated with clinical data and descriptions of experimental conditions. This study assesses the adequacy of documented asthma markers in GEO. Three objective measures (coverage, consistency and association) were used to evaluate the annotations contained in 17 asthma studies.
There were 918 asthma samples with 20,640 annotated markers, of which only 10,419 had documented values (about 50% coverage). In the one study examined in detail for consistency, drug names were used inconsistently: the brand name and the generic name of the same drug appeared in different sections. Annotated markers showed adequate association with other relevant variables (i.e., a medication was recorded only when its corresponding disease state was present).
There is inadequate variable coverage within GEO and usage of terms lacks consistency. Association between relevant variables, however, was adequate.
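As an illustration only, the coverage and association measures described above could be sketched in Python as follows. This is not the authors' code: the set of placeholder strings treated as "missing", the yes/no value encoding, and the field names `inhaled_steroid` and `asthma` are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence

# Assumption: placeholder strings that count as "no documented value".
MISSING = {"", "na", "n/a", "unknown", "not available"}


def is_documented(value: Optional[str]) -> bool:
    """True if an annotation field carries an actual value."""
    return value is not None and value.strip().lower() not in MISSING


def coverage(values: Sequence[Optional[str]]) -> float:
    """Fraction of annotated markers that have a documented value."""
    if not values:
        return 0.0
    return sum(is_documented(v) for v in values) / len(values)


def _is_yes(value: Optional[str]) -> bool:
    # Assumption: presence is encoded as the string "yes".
    return value is not None and value.strip().lower() == "yes"


def association_violations(
    samples: List[Dict[str, str]], medication_field: str, disease_field: str
) -> List[Dict[str, str]]:
    """Samples where a medication is recorded without its disease state."""
    return [
        s
        for s in samples
        if _is_yes(s.get(medication_field)) and not _is_yes(s.get(disease_field))
    ]


# Hypothetical sample annotations, for illustration only.
samples = [
    {"asthma": "yes", "inhaled_steroid": "yes"},
    {"asthma": "no", "inhaled_steroid": "yes"},   # violates the association
    {"asthma": "yes", "inhaled_steroid": "N/A"},  # medication undocumented
]
steroid_coverage = coverage([s.get("inhaled_steroid") for s in samples])
violations = association_violations(samples, "inhaled_steroid", "asthma")
```

Applied to the totals reported above, the same ratio gives 10,419 / 20,640, or roughly 50.5% coverage. Consistency (for example, brand versus generic drug names) is not covered by this sketch, since checking it would need a drug-name vocabulary that the text does not specify.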