Frequent occurrence of recognition site-like sequences in the restriction endonucleases
1 Karolinska Institute, Stockholm, Sweden
2 Homulus Informatics, 88 Howard, # 1205, San Francisco, CA 94105, USA
BMC Bioinformatics 2004, 5:30 doi:10.1186/1471-2105-5-30
Published: 16 March 2004
There are two different theories about the development of the genetic code. Woese suggested that it developed in connection with the amino acid repertoire, while Crick argued that any connection between codons and amino acids is only the result of an "accident". This question is fundamental to understanding the nature of specific protein-nucleic acid interactions.
The nature of the specific protein-nucleic acid interaction between restriction endonucleases (RE) and their recognition sequences (RS) was studied by bioinformatics methods. The frequency of 5- and 6-residue-long RS-like oligonucleotides was found to be unexpectedly high in the nucleic acid sequence of the corresponding RE (p < 0.05 and p < 0.001, respectively, n = 7). These RS-like sequences are extensively conserved in RE isoschizomers. A review of the seven available crystallographic studies showed that the amino acids coded by codons that are subsets of recognition sequences were often located close to the RS itself, and in many cases they were directly adjacent to the codon-like triplets in the RS.
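The counting step behind such a frequency comparison can be illustrated with a minimal sketch. It assumes that an "RS-like" oligonucleotide is an exact match to the recognition sequence or to one of its 5-residue windows, and it uses a uniform-base null model for the expected count. Both are illustrative assumptions, not the statistical procedure of the study. The function names and the toy sequence are hypothetical; GAATTC (the EcoRI recognition sequence) serves only as a familiar example.

```python
def count_overlapping(seq: str, motif: str) -> int:
    """Count overlapping exact occurrences of motif in seq (case-insensitive)."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def expected_count(seq_len: int, k: int) -> float:
    """Expected hits for one k-mer if all four bases were equally likely
    and independent (an assumed null model, not the paper's)."""
    return max(seq_len - k + 1, 0) / 4 ** k


# Made-up toy sequence, NOT a real endonuclease gene.
toy_gene = "ATGGAATTCAAGAATTCTTGAATTAA"
rs = "GAATTC"  # EcoRI recognition sequence, used here only as an example

observed_6 = count_overlapping(toy_gene, rs)      # full 6-residue RS
observed_5 = count_overlapping(toy_gene, rs[:5])  # 5-residue window GAATT

print(observed_6, expected_count(len(toy_gene), 6))
print(observed_5, expected_count(len(toy_gene), 5))
```

On a real gene, the observed counts would be compared with the expectation (or with counts in shuffled or unrelated sequences) by an appropriate statistical test; the choice of test is left open here.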
Fifty-five examples of this codon-amino acid co-localization were found and analyzed, representing 41.5% of the 132 amino acids located within 8 Å of the C1' atoms in the DNA. The average distance between the closest atoms in the codons and amino acids is 5.5 ± 0.2 Å (mean ± S.E.M., n = 55), while the distance between the nitrogen and oxygen atoms of the co-localized molecules is significantly shorter (3.4 ± 0.2 Å, p < 0.001, n = 15) when positively charged amino acids are involved. This indicates that an interaction between the nucleic acids and amino acids might occur.
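The distance measurements described above can be sketched as a closest-atom calculation between two sets of coordinates. This is a minimal illustration that assumes atom coordinates are already available as (x, y, z) tuples in Å, for example after parsing a crystal structure file. The function names and the toy coordinates are hypothetical and do not come from the structures reviewed in the study.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def closest_distance(atoms_a: Sequence[Coord], atoms_b: Sequence[Coord]) -> float:
    """Smallest Euclidean distance between any atom in A and any atom in B."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def within_cutoff(residue_atoms: Sequence[Coord],
                  c1_prime_atoms: Sequence[Coord],
                  cutoff: float = 8.0) -> bool:
    """True if the residue has an atom within `cutoff` Å of any C1' atom.
    The 8 Å default mirrors the cutoff quoted in the text."""
    return closest_distance(residue_atoms, c1_prime_atoms) <= cutoff


# Made-up coordinates in Å, for illustration only.
residue = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
dna_c1_prime = [(4.0, 4.0, 0.0), (10.0, 0.0, 0.0)]

d = closest_distance(residue, dna_c1_prime)
print(f"closest distance: {d:.1f} Å, within 8 Å: {within_cutoff(residue, dna_c1_prime)}")
```

The brute-force double loop is adequate for a handful of residues and nucleotides; a spatial index would only matter for whole-structure scans.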
We interpret these results in favor of Woese and suggest that the genetic code is "rational" and there is a stereospecific relationship between the codes and the amino acids.