BMC Evolutionary Biology

Open Access research article

Codon usage bias and the evolution of influenza A viruses. Codon Usage Biases of Influenza Virus

Emily HM Wong1, David K Smith2*, Raul Rabadan3, Malik Peiris1 and Leo LM Poon1*

Author Affiliations

1 Department of Microbiology, The University of Hong Kong, Pokfulam, Hong Kong, China

2 Department of Biochemistry, The University of Hong Kong, Pokfulam, Hong Kong, China

3 Department of Biomedical Informatics and Center for Computational Biology and Bioinformatics, Columbia University College of Physicians and Surgeons, New York, USA

BMC Evolutionary Biology 2010, 10:253 doi:10.1186/1471-2148-10-253

Published: 19 August 2010

Additional files

Additional file 1:

Correspondence analysis (CA) of human and avian influenza viruses, with avian viral subtypes indicated by color. Each viral gene is displayed in a three-dimensional representation. The X, Y and Z axes are arbitrary scales generated by the CA.

Format: PPT Size: 2.1MB

Additional file 2:

Outliers are enclosed by open boxes. Sequence numbers of outliers are indicated (see Additional file 3). Avian virus outliers are marked in red and human virus outliers in black.

Format: PPT Size: 516KB

Additional file 3:

Descriptions of human and avian viral sequences that were marked as outliers in Additional file 2.

Format: DOC Size: 108KB

Additional file 4:

Estimation of 3D coordinates of a viral sequence.

Format: XLS Size: 3.4MB

Additional file 5:

Cross-validation of the CA of PB2 sequences. Sequences (N = 3366) were randomly assigned to 5 groups of approximately equal size, and CA was performed on any 4 of these groups (i.e. 80% of the total sequences). Using the weights generated from the training set, coordinates of the remaining 20% (the test dataset) were predicted by applying a formula similar to the one described in Additional file 4. Left column: original graphs as described in Fig. 1A (Human PB2) and Additional file 1 (Avian PB2). Right column: representative results generated from one of the test datasets.

Format: PPT Size: 1.3MB
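The cross-validation described for Additional file 5 can be sketched as follows. This is only a minimal sketch, assuming the standard correspondence analysis transition formula for supplementary rows; the formula the authors actually used is given in Additional file 4, which is not reproduced here. The function names (`fit_ca`, `project_rows`, `five_fold_indices`, `predict_fold`) and the layout of the count matrix (rows are sequences, columns are codons) are hypothetical.

```python
import numpy as np


def fit_ca(counts, n_axes=3):
    """Fit a correspondence analysis on a (sequences x codons) count matrix.

    Returns row principal coordinates and column standard coordinates.
    Assumes every row and every column has a non-zero total.
    """
    counts = np.asarray(counts, dtype=float)
    P = counts / counts.sum()          # correspondence matrix
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    # Standardized residuals, then singular value decomposition.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U[:, :n_axes] * sigma[:n_axes]) / np.sqrt(r)[:, None]
    col_std = Vt.T[:, :n_axes] / np.sqrt(c)[:, None]
    return row_coords, col_std


def project_rows(counts, col_std):
    """Place new sequences on existing CA axes (supplementary rows).

    Each row is converted to a profile (it sums to 1) and multiplied by
    the column standard coordinates from the training fit.
    """
    counts = np.asarray(counts, dtype=float)
    profiles = counts / counts.sum(axis=1, keepdims=True)
    return profiles @ col_std


def five_fold_indices(n, seed=0):
    """Randomly split indices 0..n-1 into 5 groups of near-equal size."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), 5)


def predict_fold(counts, folds, k, n_axes=3):
    """Train CA on every fold except fold k, then predict fold k."""
    counts = np.asarray(counts, dtype=float)
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    _, col_std = fit_ca(counts[train_idx], n_axes=n_axes)
    return project_rows(counts[folds[k]], col_std)
```

One consistency check on this design: projecting the training rows themselves with `project_rows` reproduces the row coordinates returned by `fit_ca`, so predicted and fitted points share the same axes.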

Additional file 6:

Comparison of the location of recent seasonal human influenza viruses in CA. Left: CA from Figure 1 with the coordinates of the recent human H1 and H3 influenza sequences (2007 to 2009) predicted from the eigenvectors of the original CA. Right: a CA of the combined set of sequences from Figure 1 and the recent seasonal influenza sequences. Recent seasonal influenza sequences are marked in a darker color.

Format: PPT Size: 3.6MB

Additional file 7:

The ten sequences that were closest to each of the A/Brevig Mission/1/1918 genes.

Format: DOC Size: 115KB

Additional file 8:

CA of seasonal human (H1-H3), human H5, swine, avian, canine (H3N8) and equine (H3N8) influenza viruses. Each viral gene is displayed in a three-dimensional representation. The X, Y and Z axes are arbitrary scales generated by the CA.

Format: PPT Size: 380KB

Additional file 9:

Overall codon usage of influenza virus types and their hosts. Under-represented codons (relative synonymous codon usage, RSCU < 0.6) are highlighted in grey, while the most commonly used codons are in bold.

Format: DOC Size: 157KB
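The RSCU threshold mentioned for Additional file 9 can be illustrated with a short sketch. It assumes the usual definition of relative synonymous codon usage (the observed count of a codon divided by the mean count of all codons encoding the same amino acid) and the standard genetic code; the function names `rscu` and `under_represented` are hypothetical, and the source does not state whether the authors computed RSCU exactly this way.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
_CODONS = [a + b + c for a in _BASES for b in _BASES for c in _BASES]
CODON_TABLE = dict(zip(_CODONS, _AMINO))


def rscu(seq):
    """Return {codon: RSCU} for an in-frame coding sequence.

    RSCU = codon count * family size / total count of the family.
    Stop codons and amino acids absent from the sequence are skipped;
    codons containing ambiguous bases are ignored.
    """
    seq = seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid not present; RSCU is undefined
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


def under_represented(values, threshold=0.6):
    """Codons whose RSCU falls below the threshold (0.6 in the caption)."""
    return sorted(c for c, v in values.items() if v < threshold)
```

For example, a sequence containing only alanine codons, three GCT and one GCC, gives an RSCU of 3.0 for GCT and 1.0 for GCC, while GCA and GCG (RSCU 0.0) would be flagged as under-represented.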

Additional file 10:

Codons whose usage correlates positively (R ≥ 0.5) or negatively (R ≤ -0.5) with year of viral isolation in human H1N1, human H3N2 and avian influenza viruses.

Format: DOC Size: 484KB
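The R thresholds used for Additional file 10 can be illustrated as follows. This is a sketch only: it assumes R is the Pearson correlation coefficient between a per-codon usage value and the year of isolation, which the caption does not state explicitly. The function name `classify_codon_trends` and the input layout are hypothetical.

```python
import numpy as np


def classify_codon_trends(years, usage_by_codon, threshold=0.5):
    """Split codons into positive and negative trends over time.

    years          : sequence of isolation years, one per virus
    usage_by_codon : {codon: usage values aligned with `years`}
    Returns (positive, negative), two sorted lists of codons with
    R >= threshold and R <= -threshold respectively.
    """
    years = np.asarray(years, dtype=float)
    positive, negative = [], []
    for codon, usage in usage_by_codon.items():
        usage = np.asarray(usage, dtype=float)
        if usage.std() == 0 or years.std() == 0:
            continue  # correlation is undefined for constant data
        r = np.corrcoef(years, usage)[0, 1]
        if r >= threshold:
            positive.append(codon)
        elif r <= -threshold:
            negative.append(codon)
    return sorted(positive), sorted(negative)
```

Codons with constant usage are skipped here because their correlation with time is undefined; treating them as showing no trend is a design choice of this sketch.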

Additional file 11:

Correlation coefficient (R) between viral GC content and year of virus isolation.

Format: DOC Size: 29KB

Additional file 12:

Correlation coefficient (R) between nucleotide usage at the third position of a codon and year of virus isolation.

Format: DOC Size: 38KB
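The quantity correlated with time in Additional file 12 can be illustrated with a brief sketch. It assumes that "nucleotide usage at the third position" means the fraction of each base among the third positions of complete codons; the function name `third_position_usage` is hypothetical.

```python
def third_position_usage(seq):
    """Fraction of A, C, G and T at the third position of each codon.

    The sequence is assumed to be in frame; a trailing partial codon
    and any ambiguous bases are ignored.
    """
    seq = seq.upper().replace("U", "T")
    usable = seq[: len(seq) // 3 * 3]   # drop a trailing partial codon
    thirds = [b for b in usable[2::3] if b in "ACGT"]
    if not thirds:
        return {b: 0.0 for b in "ACGT"}
    return {b: thirds.count(b) / len(thirds) for b in "ACGT"}
```

Computing these fractions for each sequence and correlating one of them with the year of isolation would yield one R value per base.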

Additional file 13:

Changes in the correlation between codon usage in PB2 and that in human tissue-specific genes over time of viral isolation. The linear regression line and the correlation coefficient of each dataset are shown.

Format: PPT Size: 287KB