Research article (Open Access)

De novo assembly and characterization of the carrot transcriptome reveals novel genes, new markers, and genetic diversity

Massimo Iorizzo1, Douglas A Senalik1,2, Dariusz Grzebelus3, Megan Bowman1, Pablo F Cavagnaro4, Marta Matvienko6,7, Hamid Ashrafi5, Allen Van Deynze5 and Philipp W Simon1,2*

Author Affiliations

1 Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, WI 53706, USA

2 USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin, 1575 Linden Drive, Madison, WI 53706, USA

3 Department of Genetics, Plant Breeding and Seed Science, Agricultural University of Krakow, Al. 29 Listopada 54, 31-425 Krakow, Poland

4 CONICET and INTA EEA La Consulta, CC8 La Consulta (5567), Mendoza, Argentina

5 Seed Biotechnology Center, University of California, 1 Shields Ave, Davis, CA, USA

6 Genome Center, University of California, 1 Shields Ave, Davis, CA, USA

7 Current address: Life Technologies, 850 Lincoln Center Circle, Foster City, CA, USA


BMC Genomics 2011, 12:389  doi:10.1186/1471-2164-12-389

Published: 2 August 2011

Additional files

Additional file 1:

Table S1 - Individual genotype transcriptome assemblies. Summary of the number of contigs and singletons obtained for the B493, B493×QAL, B6274 and B7262 individual transcriptome assemblies using different assembly methods.

Table S2 - Combined transcriptome assemblies. Summary of the B493, B493×QAL, B6274 and B7262 combined transcriptome assemblies.

Table S4 - Transposable element superfamilies and families represented in ESTs.

Table S7 - Distribution of motif length in the SSR dataset.

Table S8 - Comparison of SNP validation rates using intron prediction.

Table S9 - Polymorphic SNPs tested in two mapping populations. Summary of results obtained by screening two mapping populations, B493×QAL and 70349, using 212 polymorphic SNPs.

Figure S1 - Number of contigs vs. length of contigs with hits to NCBI database. Histogram of the number of contigs with one or more BLASTX hits to the NCBI database vs. the length of the contig sequence.

Figure S2 - Genotype transcript contribution to the overall CAP3 assembly. Contribution of transcript sequences from each genotype (B493×QAL, B6274, B7262 and B493) to the overall CAP3 assembly.

Figure S3 - Comparative analysis of carrot Sanger-based sequence genes and the corresponding EST contigs. Comparative analysis of carrot Sanger-based sequence genes (A) and the corresponding EST contigs (B) from our de novo assembly. The X-axis is the sequence base pair position and the Y-axis indicates read coverage. Colors identify reads from three genotypes: green, B493×QAL; yellow, B6274; violet, B7262.

Figure S4 - Intra- and inter-sample SNP distribution. Intra- and inter-sample polymorphism distribution of computationally detected SNPs among genotypes at a sequence coverage depth of 20. Inbred line order is B493×QAL, B6274 and B7262. * M = intra-sample monomorphic, inter-sample polymorphic; P = intra- and inter-sample polymorphic.

Format: DOC; Size: 4 MB



Additional file 2:

FASTA file with 58,751 assembled sequences.

Format: FASTA; Size: 48.5 MB


Additional file 3:

Table S3 - EST contigs related to transposable elements. EST contigs related to transposable elements, as indicated by BLAST2GO annotation, and their classification into subfamilies based on similarity to RepBase entries. The original annotation was maintained for contigs showing no significant similarity to RepBase.

Table S5 - ESTs containing fragments of carrot Tdc transposons. Characteristics of EST contigs containing fragments of carrot Tdc transposons. Cells highlighted in yellow indicate elements showing the highest similarity to the corresponding contigs.

Table S6 - Characteristics of ESTs containing fragments of carrot DNA transposons DcMaster/Krak, DcSto, and Dc-hAT1.

Table S10 - Information about SSR primers tested in this study.

Table S11 - Information about SNPs tested in this study.

Format: XLS; Size: 372 KB



Additional file 4:

Assembly methods and parameters.

Format: DOC; Size: 29 KB

