Open Access Database

SHV Lactamase Engineering Database: a reconciliation tool for SHV β-lactamases in public databases

Quan K Thai and Juergen Pleiss*

Author Affiliations

Institute of Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany

BMC Genomics 2010, 11:563  doi:10.1186/1471-2164-11-563

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2164/11/563


Received: 27 April 2010
Accepted: 13 October 2010
Published: 13 October 2010

© 2010 Thai and Pleiss; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

SHV β-lactamases confer resistance to a broad range of antibiotics by accumulating mutations. The number of SHV variants is steadily increasing: 117 SHV variants have been assigned in the SHV mutation table (http://www.lahey.org/Studies/). In addition, information about SHV β-lactamases can be found in the rapidly growing NCBI protein database. The SHV β-Lactamase Engineering Database (SHVED) has been developed to collect the SHV β-lactamase sequences from the NCBI protein database and the SHV mutation table. It serves as a tool for the detection and reconciliation of inconsistencies, and for the identification of new SHV variants and amino acid substitutions.

Description

The SHVED contains 200 protein entries with distinct sequences and 20 crystal structures. 83 protein sequences are included in both the SHV mutation table and the NCBI protein database, while 35 and 82 protein sequences are found only in the SHV mutation table and only in the NCBI protein database, respectively. Of these 82 sequences, 41 originate from microbial sources, and 22 of them are full-length sequences that harbour a mutation profile which has not yet been classified in the SHV mutation table. 27 protein entries from the NCBI protein database were found to have an inconsistency in SHV name identification. These inconsistencies were reconciled using information from the SHV mutation table and stored in the SHVED.

The SHVED is accessible at http://www.LacED.uni-stuttgart.de/classA/SHVED/. It provides sequences, structures, and a multisequence alignment of SHV β-lactamases with the corrected annotation. Amino acid substitutions at each position are also provided. The SHVED is updated monthly and supplies all data for download.

Conclusions

The SHV β-Lactamase Engineering Database (SHVED) contains information about SHV variants with reconciled annotation. It serves as a tool for detection of inconsistencies in the NCBI protein database, helps to identify new mutations resulting in new SHV variants, and thus supports the investigation of sequence-function relationships of SHV β-lactamases.

Background

Since the introduction of penicillin into clinical practice in the 1940s, the effectiveness of β-lactam antibiotics has been reduced drastically [1-3]. One of the main reasons is the hydrolysis of their β-lactam ring by β-lactamases (EC 3.5.2.6), resulting in a loss of function. These enzymes, especially SHV and TEM β-lactamase variants, gradually accumulate mutations [4,5] to resist β-lactam antibiotics and spread rapidly over the world [6-8].

SHV β-lactamases belong to the class A β-lactamases and have a serine in the active site [9]. The premature protein consists of 286 amino acids. The first 21 amino acids at the N-terminus form the signal sequence and are removed to yield the mature enzyme [10]. SHV β-lactamases were first described in members of the genus Klebsiella as narrow-spectrum β-lactamases against penicillin [6,11]. Their genes are located either in the bacterial chromosome or on a plasmid [12]. Genes encoding these enzymes have mutated rapidly and have been transferred to other Gram-negative bacteria in different geographical regions [6]. Currently, 117 SHV variants have been described. A list of assigned SHV variants was compiled and is maintained by Jacoby and Bush [13]; it is referred to in this paper as the "SHV mutation table". Besides the SHV mutation table, sequence information on SHV β-lactamases can also be found in the NCBI protein database [14]. One of the important data sources of the NCBI protein database is the NCBI nucleotide database, which is open for submission of new sequences without further validation; it is therefore growing rapidly, but contains inconsistencies. In contrast, the SHV mutation table is manually curated by experts in the β-lactamase field and is therefore widely accepted as a reliable and consistent information source. In the SHV mutation table, each SHV variant is characterized by its name and its mutation profile, which is a set of amino acid substitutions at certain positions in the sequence. Positions are identified according to the Ambler numbering scheme [15]. To be listed in the SHV mutation table as a new SHV β-lactamase, a variant must have arisen naturally, be fully sequenced, and harbor a new mutation profile [13]. Therefore, engineered proteins are not considered.

The SHV Engineering Database (SHVED) was built as a comprehensive inventory by collecting data on SHV β-lactamases from these two databases. Its aims are to facilitate the detection of inconsistencies in entries derived from the NCBI protein database and eventually to reconcile them, to detect new SHV β-lactamases with novel mutation profiles, and to identify new amino acid positions at which mutations can occur.

Construction and content

Construction

Development and construction of SHVED

The amino acid sequence of SHV-1 from Klebsiella pneumoniae (GenInfo Identifier (GI): 4337048) was used as a seed sequence for building up the SHVED. A BLAST search [16] was performed against the NCBI protein database [14] without filtering of low complexity regions and with a low E-value threshold (10⁻¹²⁴) to prevent the occurrence of TEM lactamases and other non-SHV lactamases in the BLAST results. For each hit in the BLAST result, the GI was extracted and the complete XML entry was downloaded from the NCBI protein database. Information on sequence, position-specific annotations, functional descriptions, and source organism was extracted from the entry and parsed by an automated retrieval system into an in-house developed relational database system [17]. For BLAST results representing protein structures, monomers were extracted from the PDB [18] and deposited as structure entries.

Sequences generated from the annotated mutation profiles deposited in the SHV mutation table [13] were also incorporated into the SHVED. Except for 16 assigned SHVs which were "withdrawn" or "not yet released", 117 assigned SHV sequences were generated and parsed into the SHVED using the available information on amino acid exchanges and the reference sequence SHV-1. On the webpage, the "source organism" of these sequences was set to "Clinical sample" and the data source to "lc", an abbreviation of "Lahey Clinic", where the SHV mutation table is hosted.
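How a sequence might be generated from a mutation profile and a reference can be sketched in Python as follows. This is only an illustration under stated assumptions, not the SHVED code: the function name `apply_profile`, the toy reference sequence, and the toy numbering list are hypothetical, and only simple substitutions such as "L35Q" are handled. The numbering list stands in for the Ambler position labels of the reference residues, which need not be consecutive.

```python
import re

# Matches a simple substitution such as "L35Q": old residue, position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_profile(reference, numbering, profile):
    """Return the variant sequence obtained by applying substitutions to a reference.

    reference: reference amino acid sequence in one-letter code
    numbering: position labels, one per reference residue (hypothetical input)
    profile:   iterable of substitutions such as "L35Q"
    """
    if len(reference) != len(numbering):
        raise ValueError("numbering must label every reference residue")
    index_of = {label: i for i, label in enumerate(numbering)}
    residues = list(reference)
    for mutation in profile:
        match = SUBSTITUTION.match(mutation)
        if match is None:
            raise ValueError(f"unsupported mutation: {mutation}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        i = index_of[position]
        # Guard against a profile that does not fit the reference sequence.
        if residues[i] != old:
            raise ValueError(f"{mutation}: reference has {residues[i]} at {position}")
        residues[i] = new
    return "".join(residues)


# Toy data only (NOT the real SHV-1 sequence); label 37 is skipped on purpose.
toy_reference = "MRLAGK"
toy_numbering = [33, 34, 35, 36, 38, 39]
variant = apply_profile(toy_reference, toy_numbering, ["L35Q", "G38S"])
print(variant)
```

Checking the expected reference residue before each exchange is a cheap way to catch a numbering mismatch between the profile and the reference.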

Identification and naming of SHV β-lactamase sequences

Each protein sequence in the SHVED was aligned with SHV-1 using ClustalW [19] to identify its mutation profile. This mutation profile is the set of amino acid exchanges, deletions, and insertions occurring in a certain SHV, e.g. L35Q for the substitution of leucine at position 35 by glutamine. Subsequently, the mutation profile was matched against the mutation profiles listed in the SHV mutation table to determine whether the respective protein sequence is identical to an already assigned SHV. If the mutation profiles were identical, the protein was named accordingly (e.g. "SHV-3"). Otherwise it was named "SHV-like" and its mutation profile was stored. In the case of sequences longer than SHV-1, only the region corresponding to SHV-1 was examined to identify the mutation profile. Amino acid insertions arising inside the protein sequence were annotated, e.g. "-162.1D -162.2R" for the insertion of the two residues aspartic acid and arginine after the residue at position 162. Amino acid deletions were annotated with the corresponding residue and position, e.g. "G54-" for the deletion of a glycine at position 54.
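The notation described above can be illustrated with a short Python sketch that reads a mutation profile off a pairwise alignment. It is a hedged illustration, not the SHVED implementation: the function name `mutation_profile`, the toy sequences, and the toy numbering list are assumptions, and the alignment (for example one produced by ClustalW) is assumed to be already trimmed to the region shared with the reference.

```python
def mutation_profile(aligned_ref, aligned_query, numbering):
    """Derive a mutation profile from a pairwise alignment against the reference.

    aligned_ref, aligned_query: aligned strings of equal length, '-' marks a gap
    numbering: position labels, one per ungapped reference residue (hypothetical input)
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    ref_index = -1      # index of the last reference residue consumed
    insert_count = 0    # running number within the current insertion
    for ref_aa, query_aa in zip(aligned_ref, aligned_query):
        if ref_aa == "-":
            if query_aa == "-" or ref_index < 0:
                continue  # empty column, or residues before the reference starts
            insert_count += 1
            # Insertion after the last reference residue, e.g. "-162.1D"
            profile.append(f"-{numbering[ref_index]}.{insert_count}{query_aa}")
            continue
        ref_index += 1
        insert_count = 0
        position = numbering[ref_index]
        if query_aa == "-":
            profile.append(f"{ref_aa}{position}-")            # deletion, e.g. "G54-"
        elif query_aa != ref_aa:
            profile.append(f"{ref_aa}{position}{query_aa}")   # substitution, e.g. "L35Q"
    return profile


# Toy alignment only (NOT real SHV sequences).
toy_numbering = [33, 34, 35, 36, 38, 39]
profile = mutation_profile("MRLA--GK", "MRQADR-K", toy_numbering)
print(" ".join(profile))
```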

For sequences longer than SHV-1, the number of additional residues was recorded, e.g. "C+5" for a sequence 5 residues longer at its C-terminus. Sequences shorter than SHV-1 were considered fragments of the respective SHV or SHV-like sequences, even if they were named differently in the entry of the source database. The numbers of missing residues at the N- and C-terminus were annotated, e.g. "N-21 C-3" for 21 and 3 residues missing at the N- and C-terminus, respectively.
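The terminal annotations can be sketched in Python by counting leading and trailing gaps in an untrimmed pairwise alignment. The function name `terminal_annotation` and the toy sequences are hypothetical; this is one possible way to derive tags such as "N-21 C-3" or "C+5", not the procedure actually used in the SHVED.

```python
def terminal_annotation(aligned_ref, aligned_query):
    """Describe missing or additional terminal residues of the query.

    Gaps ('-') at the ends of the query mean residues are missing ("N-2");
    gaps at the ends of the reference mean the query is longer ("C+5").
    """
    def leading_gaps(sequence):
        return len(sequence) - len(sequence.lstrip("-"))

    def trailing_gaps(sequence):
        return len(sequence) - len(sequence.rstrip("-"))

    tags = []
    for label, count_gaps in (("N", leading_gaps), ("C", trailing_gaps)):
        missing = count_gaps(aligned_query)   # query shorter than the reference
        extra = count_gaps(aligned_ref)       # query longer than the reference
        if missing:
            tags.append(f"{label}-{missing}")
        if extra:
            tags.append(f"{label}+{extra}")
    return " ".join(tags)


# Toy alignments only (NOT real SHV sequences).
fragment_tag = terminal_annotation("MRLAGKTVW", "--LAGK---")
extended_tag = terminal_annotation("MRLAGK-----", "MRLAGKHHHHH")
print(fragment_tag, "|", extended_tag)
```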

Multisequence alignment and feature annotation

The annotation information was enriched by performing a multisequence alignment using ClustalW [19]. Information on secondary structure calculated using DSSP [20] was also included in the SHVED. Individual residues in the sequences as well as in the alignments were numbered according to the standard scheme suggested by Ambler [15].

Reconciliation of data inconsistencies

A systematic comparison of entries of the NCBI protein database and the SHV mutation table allows the reconciliation of NCBI protein database entries which have an inconsistent annotation. In the SHVED, a wrong name assignment is corrected if the mutation profile of the sequence is already included in the SHV mutation table. A sequence with a new mutation profile is stored in the SHVED as a new SHV β-lactamase, even if it has been given a (wrong) SHV name by the authors in the NCBI protein database. A link from the reconciled SHVED entry to the original NCBI protein database entry allows the author of the respective entry to correct an erroneous entry.
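The reconciliation rule can be sketched as a lookup of the observed mutation profile in a table of assigned profiles. The Python below is an illustration under assumptions: the function name `reconcile` is hypothetical, and the table holds only the few profiles quoted in this article (SHV-1 as the unmutated reference, SHV-12, SHV-48 and SHV-104), whereas a real table would contain every assigned variant.

```python
# Toy excerpt: only mutation profiles quoted in this article.
MUTATION_TABLE = {
    frozenset(): "SHV-1",  # reference sequence, no mutations
    frozenset({"L35Q", "G238S", "E240K"}): "SHV-12",
    frozenset({"V119I"}): "SHV-48",
    frozenset({"M5L", "R202S"}): "SHV-104",
}


def reconcile(annotated_name, profile):
    """Return (reconciled_name, is_consistent) for one full-length sequence.

    annotated_name: SHV name given in the source database entry
    profile:        mutation profile observed in the deposited sequence
    """
    assigned = MUTATION_TABLE.get(frozenset(profile))
    # Profiles missing from the table are kept under the generic name "SHV-like".
    name = assigned if assigned is not None else "SHV-like"
    return name, name == annotated_name


# Annotated as SHV-5, but the observed profile is that of SHV-12.
renamed = reconcile("SHV-5", ["L35Q", "G238S", "E240K"])
# Annotated as SHV-104, but M5L is absent; unmatched in this toy table.
unmatched = reconcile("SHV-104", ["R202S"])
print(renamed, unmatched)
```

Using a frozenset as the key makes the lookup independent of the order in which the mutations are listed.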

Content

Data content of the SHVED

452 protein sequence entries from the NCBI protein database and 117 protein sequences from the SHV mutation table were collected and parsed into the SHVED, resulting in 200 distinct protein entries. 20 crystal structures of 2 SHV β-lactamases (SHV-1 and SHV-2) were stored in the SHVED. 19 crystal structures were from SHV-1 with one or two engineered mutations. Apart from the structure with PDB entry 3D4F, which is a full-length sequence, all crystal structures lack the 21 residues of the N-terminal signal sequence. Two protein sequences (PDB entries 2A3U and 2A49) possess 5 and 4 additional residues, respectively, at their C-terminus (Table 1).

Table 1. PDB code of crystal structure entries in SHVED and their sequence annotations

Of the 200 proteins, 35 SHV sequences were derived from the SHV mutation table but not from the NCBI protein database, 82 protein sequences were found exclusively in the NCBI protein database, and 83 protein sequences were accessible in both source databases. Of the 82 protein sequences found only in the NCBI protein database, 41 originate from microbial sources and harbor a new mutation profile: 22 are full-length sequences (Table 2) and 19 are fragments (Table 3).
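The split into sequences found in one or both sources can be sketched in Python by grouping entries on their sequence. The function name `partition_by_source` and the toy entries are hypothetical; "lc" and "ncbi" are used here as source labels for the SHV mutation table and the NCBI protein database. The final check only confirms that the three reported counts add up to the 200 distinct entries.

```python
from collections import defaultdict


def partition_by_source(entries):
    """Count distinct sequences by the source(s) in which they occur.

    entries: iterable of (source, sequence) pairs, source being "lc" or "ncbi"
    """
    sources = defaultdict(set)
    for source, sequence in entries:
        sources[sequence].add(source)   # identical sequences collapse to one entry
    counts = {"lc_only": 0, "ncbi_only": 0, "both": 0}
    for found_in in sources.values():
        if found_in == {"lc"}:
            counts["lc_only"] += 1
        elif found_in == {"ncbi"}:
            counts["ncbi_only"] += 1
        elif found_in == {"lc", "ncbi"}:
            counts["both"] += 1
    return counts


# Toy entries only: "AAA" occurs twice in one source and once in the other.
toy_entries = [
    ("ncbi", "AAA"), ("ncbi", "AAA"), ("lc", "AAA"),
    ("lc", "BBB"),
    ("ncbi", "CCC"),
]
toy_counts = partition_by_source(toy_entries)
print(toy_counts)

# Reported counts: 35 (table only) + 82 (NCBI only) + 83 (both) distinct sequences.
reported_total = 35 + 82 + 83
print(reported_total)
```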

Table 2. New mutation profiles of full length sequences originating from microorganisms

Table 3. Fragments with new mutation profiles

Analysis of amino acid substitutions and substitution positions

In addition to the amino acid substitutions described in the SHV mutation table [13], 27 new substitution positions in protein entries originating from microbial sources have been identified. 11 new substitution positions were found in full-length sequences (table S1, Additional file 1) and 18 new substitution positions were found in fragments (table S2, Additional file 1); 2 of these positions (positions 6 and 289) were found both in full-length sequences and in fragments. These new substitution positions are spread over the complete protein sequence, including the signal peptide and the C-terminus. Most of the substitutions found in full-length sequences are located at the protein surface and are distant from the active site, except for T235 and I260 (Figure 1). Of the 18 new substitution positions found in fragments, 9 positions are at the C-terminus, 4 positions are on the protein surface, 3 positions are in the protein core, and 2 are in the signal peptide (Figure 2). Not only substitutions at new positions, but also new amino acid exchanges at already known positions were found. As an example, the protein sequence with GI 259038268 harbors a lysine at position 252 instead of a proline. In the SHV mutation table, only the substitution P252G is described.

Figure 1. The structure of SHV-1 β-lactamase (PDB entry 1SHV) with new substitution positions found in full-length sequences. Amino acid side chains are shown in stick representation: substitutions occurring at novel positions (green), novel amino acid substitution at known position (red), active site residues (yellow).

Figure 2. The structure of SHV-1 β-lactamase (PDB entry 1SHV) with new substitution positions found in fragments. Amino acid side chains are shown in stick representation: substitutions occurring at novel positions (green), novel amino acid substitution at known position (red), active site residues (yellow).

Additional file 1. Additional_file_1.pdf contains table S1 and table S2 mentioned in the text. They list new mutation profiles of sequences derived from microbial organisms.

Data inconsistencies

There are 27 distinct protein entries derived from the NCBI protein database that have inconsistent annotations (Table 4). In all cases, the annotated SHV name is inconsistent with the mutation profile. For example, the protein sequence with GI 40950644 has three mutations (L35Q, G238S, and E240K) and should therefore be named "SHV-12" according to the SHV mutation table, but it is annotated as "beta-lactamase SHV-5" in the NCBI protein database. In 12 cases, the protein sequence is a fragment and therefore there is not enough information to rename it in the SHVED.

Table 4. Inconsistencies between information from NCBI protein database and SHV mutation table

Utility

A multisequence alignment of all 200 protein entries was generated using ClustalW. For protein structures, all sequence entries were included and displayed with aligned secondary structure information. Proteins were labeled by their GIs and linked to the NCBI protein database. Annotation of individual residues is visualized by color-coding in the alignment and is shown upon moving the cursor over the respective residue. The SHVED is accessible at http://www.LacED.uni-stuttgart.de/classA/SHVED with a JavaScript-enabled WWW browser. Protein tables provide information on the protein name, mutations, number of residues missing at the N- and C-terminus (in the case of fragments), and the source organism. As an alternative to the multisequence alignment, the SHV variants are visualized as mutations relative to the sequence of SHV-1. Substitution positions are colored and annotated with the exchanged amino acids.

Discussion

Data content of the SHVED

By systematic analysis of the protein sequences in the SHVED, 41 protein sequences with a new mutation profile were identified. 22 of them are full-length sequences originating from microbial sources and are therefore candidates for a new SHV number assignment. The new mutations occurred either at new positions in the sequence or were new amino acid exchanges at already described positions.

Detection of novel SHV β-lactamases and novel amino acid substitutions

Except for one new mutation profile originating from a synthetic construct (GI 151861), all new mutation profiles originated from microbial sources. As plasmid-borne genes, the blaSHV genes encoding SHV β-lactamases are easily transferred among Gram-negative bacteria, especially Enterobacteriaceae, because of their close genetic relationship [6]. Thus, most of the newly detected SHV β-lactamases are from Enterobacteriaceae such as Klebsiella pneumoniae (14 SHVs), Escherichia coli (15 SHVs), Enterobacter cloacae (1 SHV), from both K. pneumoniae and E. coli (1 SHV), and from both K. pneumoniae and E. cloacae (1 SHV). Additionally, 3 new SHV variants were found in Acinetobacter baumannii and 1 new SHV variant was found in Salmonella enterica. Although 19 fragments harbor a new mutation profile, they cannot be assigned a new SHV number because of missing sequence information. However, the information about the substitutions at new positions found in these fragments could be used in the future to predict the occurrence of new SHV variants.

Data inconsistencies and reconciliation

In all 27 cases of inconsistency, the annotated name differed from the actual mutation profile. However, the reasons for the inconsistency varied. In the case of the protein sequence with GI 154269503, the lysine at position 256 is substituted by an arginine, whereas the corresponding publication reports that a lysine is exchanged by an arginine at position 250 (K250R) [21]. In the SHV mutation table, it is listed as SHV-103 and characterized by the substitution of a leucine at position 250 by an arginine (L250R). A mutation at position 256 is not yet recorded in the SHV mutation table, and the mutation at position 250 is seen only in SHV-103. The inconsistency was probably caused by a difference in amino acid numbering between the author of GI 154269503 and the curators of the SHV mutation table at Lahey Clinic. In the case of the protein sequence with GI 161367444, the inconsistency might derive from the primer used. In the sequence, only one mutation, R202S, was found, while it is annotated as SHV-104, which has two mutations (M5L and R202S) according to the SHV mutation table. It is noted in the NCBI entry that the forward primer "ATGCGTTATATTCGCCTGTGTATT" was used to amplify the target DNA, which results in a methionine at position 5. Therefore, the deduced amino acid substitution M5L (if it actually occurred) could not be present in the deposited amino acid sequence, and the deposited amino acid sequence should not be annotated as SHV-104 because it does not harbor the mutation profile "M5L R202S". In the case of the protein sequence with GI 15718691, the duplication of the pentapeptide 163DRWET167 was reported [22] and assigned as SHV-16. In addition, however, two mutations, H96T and Y97H, are present in the amino acid sequence. Therefore, it is not clear whether the actual SHV-16 harbors only the pentapeptide duplication or additionally the mutations H96T and Y97H.
In other cases of inconsistency, the amino acid sequences were submitted to the NCBI protein database without a corresponding publication and showed inconsistencies in their annotation. One example is the protein sequence with GI 30230495. It is annotated as SHV-48, which should harbor the mutation V119I according to the SHV mutation table, while four mutations (L35Q, R191H, G238S, and E240K) were actually found in the deposited amino acid sequence. In the SHV mutation table itself, an inconsistency in residue numbering (positions 253 and 255) was revealed and communicated to the curator for correction.

Conclusion

The SHV Lactamase Engineering Database (SHVED) was established to identify new SHV β-lactamases and to identify inconsistencies in public databases. Based on our analysis, 22 candidates for assignment of new SHV names were identified. 27 protein entries with inconsistencies were found and reconciled. In addition, three assigned mutation profiles were found to be in doubt: SHV-16, SHV-103, and SHV-104. The SHVED thus supports the scientific community in naming new SHV β-lactamases and in reconciling the existing annotation of SHV β-lactamase sequences.

Availability and requirements

The SHVED is accessible at http://www.LacED.uni-stuttgart.de/classA/SHVED/ with a JavaScript-enabled WWW browser.

Abbreviations

GI: GenInfo Identifier; SHVED: SHV Engineering Database

Authors' contributions

QKT developed the database, built the web pages, analyzed the data, and drafted the manuscript. JP supervised the study and finalized the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We acknowledge Florian Wagner for valuable discussions and his support in building up the database. This work was supported by the Federal Ministry of Education and Research of Germany (VNB 04/B12).

References

  1. Livermore DM: beta-Lactamases in laboratory and clinical resistance. Clin Microbiol Rev 1995, 8(4):557-584.

  2. Paterson DL, Bonomo RA: Extended-spectrum beta-lactamases: a clinical update. Clin Microbiol Rev 2005, 18(4):657-686.

  3. Bradford PA: Extended-spectrum beta-lactamases in the 21st century: characterization, epidemiology, and detection of this important resistance threat. Clin Microbiol Rev 2001, 14(4):933-951.

  4. Gniadkowski M: Evolution of extended-spectrum beta-lactamases by mutation. Clin Microbiol Infect 2008, 14(1):11-32.

  5. Hall BG, Barlow M: Evolution of the serine beta-lactamases: past, present and future. Drug Resist Updat 2004, 7(2):111-123.

  6. Heritage J, M'Zali FH, Gascoyne-Binzi D, Hawkey PM: Evolution and spread of SHV extended-spectrum beta-lactamases in Gram-negative bacteria. J Antimicrob Chemother 1999, 44(3):309-318.

  7. Paterson DL, Hujer KM, Hujer AM, Yeiser B, Bonomo MD, Rice LB, Bonomo RA: Extended-spectrum beta-lactamases in Klebsiella pneumoniae bloodstream isolates from seven countries: dominance and widespread prevalence of SHV- and CTX-M-type beta-lactamases. Antimicrob Agents Chemother 2003, 47(11):3554-3560.

  8. Chang FY, Siu LK, Fung CP, Huang MH, Ho M: Diversity of SHV and TEM beta-lactamases in Klebsiella pneumoniae: gene evolution in Northern Taiwan and two novel beta-lactamases, SHV-25 and SHV-26. Antimicrob Agents Chemother 2001, 45(9):2407-2413.

  9. Ambler RP: The structure of beta-lactamases. Philos Trans R Soc Lond B Biol Sci 1980, 289(1036):321-331.

  10. Kuzin AP, Nukaga M, Nukaga Y, Hujer AM, Bonomo RA, Knox JR: Structure of the SHV-1 beta-lactamase. Biochemistry 1999, 38(18):5720-5727.

  11. Nugent ME, Hedges RW: The nature of the genetic determinant for the SHV-1 beta-lactamase. Mol Gen Genet 1979, 175(3):239-243.

  12. Ford PJ, Avison MB: Evolutionary mapping of the SHV beta-lactamase and evidence for two separate IS26-dependent blaSHV mobilization events from the Klebsiella pneumoniae chromosome. J Antimicrob Chemother 2004, 54(1):69-75.

  13. Jacoby G, Bush K: SHV Extended-Spectrum and Inhibitor Resistant β-Lactamases. [http://www.lahey.org/Studies/]

  14. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL: GenBank. Nucleic Acids Res 2008, 36(Database):D25-30.

  15. Ambler RP, Coulson AF, Frere JM, Ghuysen JM, Joris B, Forsman M, Levesque RC, Tiraby G, Waley SG: A standard numbering scheme for the class A beta-lactamases. Biochem J 1991, 276(Pt 1):269-270.

  16. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389-3402.

  17. Fischer M, Thai QK, Grieb M, Pleiss J: DWARF--a data warehouse system for analyzing protein families. BMC Bioinformatics 2006, 7:495.

  18. Schwede T, Diemand A, Guex N, Peitsch MC: Protein structure computing in the genomic era. Res Microbiol 2000, 151(2):107-112.

  19. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22(22):4673-4680.

  20. Kabsch W, Sander C: Dictionary of Protein Secondary Structure - Pattern-Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 1983, 22(12):2577-2637.

  21. Abbassi MS, Torres C, Achour W, Vinue L, Saenz Y, Costa D, Bouchami O, Ben Hassen A: Genetic characterisation of CTX-M-15-producing Klebsiella pneumoniae and Escherichia coli strains isolated from stem cell transplant patients in Tunisia. Int J Antimicrob Agents 2008, 32(4):308-314.

  22. Arpin C, Labia R, Andre C, Frigo CC, El Harrif Z, Quentin C: SHV-16, a beta-lactamase with a pentapeptide duplication in the omega loop. Antimicrob Agents Chemother 2001, 45(9):2480-2485.