Open Access Research article

A survey of orphan enzyme activities

Yannick Pouliot1,2 and Peter D Karp1*

Author Affiliations

1 Bioinformatics Research Group, Artificial Intelligence Center, SRI International, 333 Ravenswood Ave, Menlo Park, California, 94025-3493, USA

2 Lane Medical Library and Knowledge Management Center, Information Resources and Technology, Stanford University Medical Center, 300 Pasteur Drive, Stanford, CA 94305-5123, USA

BMC Bioinformatics 2007, 8:244  doi:10.1186/1471-2105-8-244

Published: 10 July 2007



Using computational database searches, we have demonstrated previously that no gene sequences could be found for at least 36% of enzyme activities that have been assigned an Enzyme Commission number. Here we present a follow-up literature-based survey involving a statistically significant sample of such "orphan" activities. The survey was intended to determine whether sequences for these enzyme activities are truly unknown, or whether these sequences are absent from the public sequence databases but can be found in the literature.


We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles.


This survey points toward the significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.