Open Access Research article

Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak

Saneyoshi Ueno1,2, Grégoire Le Provost1, Valérie Léger1, Christophe Klopp3, Céline Noirot3, Jean-Marc Frigerio1, Franck Salin1, Jérôme Salse4, Michael Abrouk4, Florent Murat4, Oliver Brendel5, Jérémy Derory1, Pierre Abadie1, Patrick Léger1, Cyril Cabane6,7, Aurélien Barré6, Antoine de Daruvar6,7, Arnaud Couloux8, Patrick Wincker8, Marie-Pierre Reviron1, Antoine Kremer1 and Christophe Plomion1*

Author Affiliations

1 INRA, UMR 1202 BIOGECO, 69 route d'Arcachon, F-33612 Cestas, France

2 Forestry and Forest Products Research Institute, Department of Forest Genetics, Tree Genetics Laboratory, 1 Matsunosato, Tsukuba, Ibaraki, 305-8687, Japan

3 Plateforme bioinformatique Genotoul, UR875 Biométrie et Intelligence Artificielle, INRA, 31326 Castanet-Tolosan, France

4 INRA/UBP UMR 1095, Laboratoire Génétique, Diversité et Ecophysiologie des Céréales, 234 avenue du Brézet, 63100 Clermont Ferrand, France

5 INRA, UMR1137 EEF "Ecologie et Ecophysiologie Forestières", F 54280 Champenoux, France

6 Université de Bordeaux, Centre de Bioinformatique de Bordeaux, Bordeaux, France

7 CNRS, UMR 5800, Laboratoire Bordelais de Recherche en Informatique, Talence, France

8 CEA, DSV, Genoscope, Centre National de Séquençage, 2 rue Gaston Crémieux CP5706 91057 Evry cedex, France


BMC Genomics 2010, 11:650  doi:10.1186/1471-2164-11-650

Published: 23 November 2010

Additional files

Additional file 1:

Table S1: Number of reads with significant Blast hits against E. coli, phage and yeast sequences for libraries pyrosequenced by Roche 454.

Format: PDF Size: 43KB

Additional file 2:

Table S2: Newbler assembly in NG6 (http://vm-bioinfo.toulouse.inra.fr/ng6/) for libraries pyrosequenced by Roche 454.

Format: PDF Size: 35KB

Additional file 3:

Figure S1: Sequence length distribution for unigene elements constructed by (A) PartiGene, (B) TGICL and (C) MIRA. Unigene elements (contigs and singletons) from PartiGene were built from Sanger reads only, while those from TGICL and MIRA were built from both Sanger and 454 reads. The set of unigene elements produced by TGICL is named "OakContigV1".

Format: PPT Size: 206KB

Additional file 4:

Figure S2: Distribution of the number of reads in a contig (depth of a contig). Contigs resulting from PartiGene (brown bar), TGICL (green bar) and MIRA (blue bar) analysis.

Format: PPT Size: 198KB

Additional file 5:

Figure S3: Frequency distribution of the number of peptides predicted from unigene elements. Frequency of FrameDP-predicted peptides resulting from PartiGene (brown bar), TGICL (green bar) and MIRA (blue bar) assembly.

Format: PPT Size: 182KB

Additional file 6:

Table S3: Oak homologs of poplar candidate genes for bud phenology.

Format: PDF Size: 42KB

Additional file 7:

Table S4: Oak homologs of Arabidopsis thaliana genes related to drought stress resistance, with emphasis on cuticle formation.

Format: PDF Size: 70KB

Additional file 8:

Figure S4: Phenylpropanoid biosynthesis-related genes found in OakContigV1. The genes are listed below by color code, with the number of OakContigV1 sequences in parentheses. Red: EC:2.1.1.104 [caffeoyl-CoA O-methyltransferase] (31); Yellow: EC:1.11.1.7 [peroxidase] (212); Orange: EC:1.1.1.195 [cinnamyl-alcohol dehydrogenase] (28); Green: EC:3.2.1.21 [beta-glucosidase] (54); Blue: EC:2.1.1.68 [caffeate O-methyltransferase] (38); Pink: EC:2.3.1.92 [sinapoylglucose---malate O-sinapoyltransferase] (1); Violet: EC:2.3.1.91 [sinapoylglucose---choline O-sinapoyltransferase] (2); Light-red: EC:1.2.1.68 [coniferyl-aldehyde dehydrogenase] (3); Light-green: EC:1.14.13.11 [trans-cinnamate 4-monooxygenase] (10); Light-yellow: EC:6.2.1.12 [4-coumarate---CoA ligase] (15).

Format: PPT Size: 145KB

Additional file 9:

Table S5: Homology search results against the MAIZEWALL database.

Format: PDF Size: 168KB

Additional file 10:

Table S6: SSRs detected in OakContigV1 sequences. The OakContigV1 sequences were searched for SSR motifs (at least 5, 4, 3, 3 and 3 repeats for di-, tri-, tetra-, penta- and hexa-nucleotides, respectively) with the mreps program (Kolpakov et al. 2003). Annotations are based on a BlastX search against the SWISS-PROT database with an e-value cut-off of 1e-5. "nil" indicates no hits.

Format: XLS Size: 6.8MB
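The minimum repeat counts given for Table S6 can be illustrated with a short Python sketch. This is not mreps and does not reproduce its algorithm; it is a simplified regular-expression search that finds only perfect tandem repeats, and the names MIN_REPEATS and find_ssrs are hypothetical, introduced here for illustration only.

```python
import re

# Minimum repeat counts per motif length (di- to hexa-nucleotides),
# as stated in the Table S6 caption. The dictionary name is an assumption.
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Simplified sketch: exact repeats only, unlike a dedicated tool such as mreps.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One motif of motif_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" or "ATAT").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)
```

For example, find_ssrs("GGCATATATATATCC") reports one dinucleotide repeat, (3, "AT", 5), while a sequence with only four AT units falls below the dinucleotide threshold and yields no hit.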

Additional file 11:

Figure S5: Microsatellite frequency detected by mreps for eight gene indices and OakContigV1. The search was performed for di- (with a repeat count n >= 5 repeat units), tri- (n >= 4), tetra- (n >= 3), penta- (n >= 3) and hexa- (n >= 3) nucleotides. The gene index abbreviations are as follows: AGI, Arabidopsis thaliana; HAGI, Helianthus annuus; NTGI, Nicotiana tabacum; MTGI, Medicago truncatula; OGI, Oryza sativa; PPLGI, Populus; SGI, Picea; and VVGI, Vitis vinifera.

Format: PPT Size: 197KB

Additional file 12:

Figure S6: Estimation of SSR location by analysis with ESTScan and mreps software.

Format: PPT Size: 134KB