Email updates

Keep up to date with the latest news and content from BMC Medical Genomics and BioMed Central.

Open Access Highly Accessed Software

Genotator: A disease-agnostic tool for genetic annotation of disease

Dennis P Wall1*, Rimma Pivovarov12, Mark Tong1, Jae-Yoon Jung1, Vincent A Fusaro1, Todd F DeLuca1 and Peter J Tonellato1

Author affiliations

1 Center for Biomedical informatics, Harvard Medical School, Boston, MA 02115

2 Department of Biomedical Informatics, Columbia University, New York, NY 10032

For all author emails, please log on.

Citation and License

BMC Medical Genomics 2010, 3:50  doi:10.1186/1755-8794-3-50

Published: 29 October 2010

Abstract

Background

Disease-specific genetic information has been increasing at rapid rates as a consequence of recent improvements and massive cost reductions in sequencing technologies. Numerous systems designed to capture and organize this mounting sea of genetic data have emerged, but these resources differ dramatically in their disease coverage and genetic depth. With few exceptions, researchers must manually search a variety of sites to assemble a complete set of genetic evidence for a particular disease of interest, a process that is both time-consuming and error-prone.

Methods

We designed a real-time aggregation tool that provides both comprehensive coverage and reliable gene-to-disease rankings for any disease. Our tool, called Genotator, automatically integrates data from 11 externally accessible clinical genetics resources and uses these data in a straightforward formula to rank genes in order of disease relevance. We tested the accuracy of coverage of Genotator in three separate diseases for which there exist specialty curated databases, Autism Spectrum Disorder, Parkinson's Disease, and Alzheimer Disease. Genotator is freely available at http://genotator.hms.harvard.edu webcite.

Results

Genotator demonstrated that most of the 11 selected databases contain unique information about the genetic composition of disease, with 2514 genes found in only one of the 11 databases. These findings confirm that the integration of these databases provides a more complete picture than would be possible from any one database alone. Genotator successfully identified at least 75% of the top ranked genes for all three of our use cases, including a 90% concordance with the top 40 ranked candidates for Alzheimer Disease.

Conclusions

As a meta-query engine, Genotator provides high coverage of both historical genetic research as well as recent advances in the genetic understanding of specific diseases. As such, Genotator provides a real-time aggregation of ranked data that remains current with the pace of research in the disease fields. Genotator's algorithm appropriately transforms query terms to match the input requirements of each targeted databases and accurately resolves named synonyms to ensure full coverage of the genetic results with official nomenclature. Genotator generates an excel-style output that is consistent across disease queries and readily importable to other applications.