This article is part of the supplement: IUFRO Tree Biotechnology Conference 2011: From Genomes to Integration and Delivery

Open Access Oral presentation

High-throughput targeted SNP discovery using Next Generation Sequencing (NGS) in few selected candidate genes in Eucalyptus camaldulensis

Prasad Suresh Hendre*, Rathinam Kamalakannan, Rathinavelu Rajkumar and Mohan Varghese

Author Affiliations

ITC R&D Centre, Peenya Industrial Area, No.3, 1st Main, 1st Phase, Bangalore - 560 058, Karnataka, India

BMC Proceedings 2011, 5(Suppl 7):O17  doi:10.1186/1753-6561-5-S7-O17

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1753-6561/5/S7/O17


Published: 13 September 2011

© 2011 Hendre et al; licensee BioMed Central Ltd.

This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

The present era of high-throughput technologies offers immense promise and innovative applications for SNP discovery and high-quality parallel genotyping [1,2]. Advances in next generation sequencing (NGS) technologies make en masse SNP discovery in targeted genomic regions possible for eucalypts. The river red gum, Eucalyptus camaldulensis (Ec), is a fast-growing, hardy and highly adaptable eucalypt species acclimatized to Indian climatic conditions, and these advances would aid in developing new tools and techniques for its improvement. To our knowledge, only limited efforts have been made to identify SNP markers in eucalypts, either by RNA sequencing [3] or by using the few genes available in the literature [4]. Despite the small scale of these efforts, useful SNP markers with potential application were discovered in the Cinnamoyl CoA Reductase (CCR) gene [5]. Using the recently released whole genome sequence of E. grandis (Eg), we describe here targeted SNP discovery in 41 candidate genes using Illumina’s 72-base paired-end sequencing technology.

Materials and methods

DNA was isolated from a SNP discovery panel consisting of 96 individuals from a naturally mating Ec population from Australia, following standard procedures (modified CTAB method). Twelve primary DNA pools were constituted by mixing equimolar concentrations of eight DNAs at 10 ng/mL. Forty-one genes selected for SNP discovery were identified from the Eg genome (http://eucalyptusdb.bi.up.ac.za/gbrowse8x) using Arabidopsis TAIR 9 gene IDs, and primer pairs were designed to amplify the gene fragments. Each primary DNA pool was amplified (Veriti-ABI) using Paq DNA polymerase (Agilent Technologies); all amplicons were then pooled (Figure 1), eluted if necessary (EcCRE-AHK4, EcOBP1), precipitated using ethanol and dissolved in TE (0.1).
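
The pooling arithmetic in this design can be sketched as follows. This is only an illustration: the sample labels and the consecutive assignment of individuals to pools are assumptions (the text does not say how individuals were allocated), and the allele-frequency figures assume diploid individuals contributing equally to each pool.

```python
# Hypothetical sample labels; the source does not name the 96 individuals.
samples = [f"Ec{i:02d}" for i in range(1, 97)]

POOL_SIZE = 8  # DNAs per primary pool, as stated in the text


def make_pools(ids, pool_size):
    """Split ids into consecutive, non-overlapping pools of pool_size.

    Consecutive assignment is an assumption made for illustration only.
    """
    if len(ids) % pool_size != 0:
        raise ValueError("sample count must be a multiple of the pool size")
    return [ids[i:i + pool_size] for i in range(0, len(ids), pool_size)]


pools = make_pools(samples, POOL_SIZE)

# Assuming diploid individuals, count chromosome copies per pool and overall.
chromosomes_per_pool = 2 * POOL_SIZE
chromosomes_total = 2 * len(samples)

# Expected frequency of an allele present as a single copy,
# under perfectly equimolar mixing.
singleton_freq_in_pool = 1 / chromosomes_per_pool
singleton_freq_overall = 1 / chromosomes_total

print(len(pools), "primary pools of", POOL_SIZE, "DNAs")
print(f"single-copy allele: {singleton_freq_in_pool:.4f} within one pool, "
      f"{singleton_freq_overall:.4f} across all 96 individuals")
```

Under these assumptions a single allele copy would sit well below a 5% frequency cutoff once all pools are combined, which gives some context for why a rare-allele filter matters with pooled sequencing.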

Figure 1. Strategy for hierarchical pooling of 96 DNAs and PCR products for SNP discovery using Illumina NGS platform

A paired-end library suitable for 72-base read length was prepared and sequenced on an Illumina GAIIx sequencer, and the data were analyzed using bwa and samtools with appropriate parameters (outsourced to Genotypic Technologies Ltd, Bangalore). The SNP data were adjusted for read depth (1/10th SD) and rare allele frequency (<5%). Approximate equal frequency (EF) blocks were then estimated manually by nearest neighborhood (NN) analysis in MS Excel (MS Office 2007), wherein a block of NN SNPs with a frequency difference of less than 0.02-0.03 was considered a single EF block. The web-based gene prediction tool FGENESH (http://linux1.softberry.com) was used with the Arabidopsis thaliana gene model to identify genic regions such as UTRs, exons and introns.
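
The EF block grouping was done by hand in a spreadsheet, so the following is only one possible reading of it, not the authors' procedure. It assumes that each SNP is compared with its immediate neighbour along the amplicon, that a fixed threshold of 0.03 is used, and that block length is counted inclusively, so a single-SNP block has length 1 bp. The function names and the example data are hypothetical.

```python
from typing import List, Tuple

Snp = Tuple[int, float]  # (position in bp, allele frequency)


def ef_blocks(snps: List[Snp], max_diff: float = 0.03) -> List[List[Snp]]:
    """Group position-sorted SNPs into approximate equal frequency blocks.

    A SNP joins the current block when its allele frequency differs from
    that of the previous SNP by less than max_diff; otherwise it starts
    a new block. This neighbour-to-neighbour rule is an assumption.
    """
    blocks: List[List[Snp]] = []
    for snp in sorted(snps):
        if blocks and abs(snp[1] - blocks[-1][-1][1]) < max_diff:
            blocks[-1].append(snp)
        else:
            blocks.append([snp])
    return blocks


def block_length(block: List[Snp]) -> int:
    """Inclusive span in bp from the first to the last SNP of a block."""
    return block[-1][0] - block[0][0] + 1


# Made-up positions and frequencies, for illustration only.
example = [(120, 0.21), (145, 0.22), (190, 0.20),
           (260, 0.41), (300, 0.43), (512, 0.08)]

blocks = ef_blocks(example)
for block in blocks:
    positions = [pos for pos, _ in block]
    print(positions, "length:", block_length(block), "bp")
```

Comparing each SNP with its immediate predecessor lets frequencies drift gradually within a block; comparing against the first SNP of the block instead would be a stricter alternative.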

Results and discussion

Forty-one growth and adaptive genes were selected based on a literature search [6, TAIR database]. A total of 100.5 kb of genomic sequence from the Ec genome, spread over ~1055 Mbp of reads, was generated (~94% high quality reads with an average read depth of 6124). A total of 11,329 SNPs were polymorphic within Ec, and 378 SNPs exhibited inter-species polymorphism between Ec and Eg. In addition, 75 insertions and 90 deletions within Ec and eight intra-specific deletions in comparison to Eg were detected. After the corrections described in the methods, the number of ‘useful’ SNPs fell to 1,191, which was ~10.5% of the original SNP count (a frequency of ~1 per 84.5 bp). Table 1 describes findings from the present analysis of SNPs. A total of 198 putative EF blocks containing 541 SNPs (~3 SNPs/block), grossly comparable to LD blocks, were detected, with 55, 65 and 34 in exons, introns and exon-intron junctions, respectively (the remaining categories were small in number). The blocks had an average length of ~105 bp (SD: ±182; range: 1-1234 bp; distribution shown in Figure 2) and would aid in the selection of SNPs. The mean block lengths adjusted for the respective amplicon lengths were around 0.014 to 0.016 (SD: ±0.013 to ±0.015) for exons, introns and nongenic regions, whereas for intron-exon junctions the value was 0.028±0.023, significantly longer than the rest (p=0.03).
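
The summary ratios quoted above can be reproduced from the reported counts. The short check below uses only numbers given in the text; treating 100.5 kb as exactly 100,500 bp is an assumption, which is why the SNP spacing comes out close to, but not exactly at, the reported ~84.5 bp.

```python
# Counts as reported in the text
total_snps = 11_329       # SNPs polymorphic within Ec
useful_snps = 1_191       # SNPs retained after depth and frequency filters
sequence_bp = 100_500     # 100.5 kb, assumed to be exactly 100,500 bp
ef_block_count = 198      # putative EF blocks
ef_block_snps = 541       # SNPs falling in EF blocks

retained_fraction = useful_snps / total_snps
bp_per_snp = sequence_bp / useful_snps
snps_per_block = ef_block_snps / ef_block_count

print(f"retained SNPs: {retained_fraction:.1%}")
print(f"average spacing: 1 SNP per {bp_per_snp:.1f} bp")
print(f"SNPs per EF block: {snps_per_block:.1f}")
```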

Table 1. Results from SNP discovery in 41 candidate genes.

Figure 2. Bar graph showing distribution of 198 equal frequency (EF) blocks according to their length (bp)

Conclusions

Herein, the NGS (Illumina) platform was successfully used to identify ~1,200 SNPs in 41 targeted genes in Ec, shedding light on the quantitative and qualitative distribution of SNPs. In addition, the analysis of EF blocks provided important guidelines for the selection of SNPs for genotyping.

Acknowledgements

The authors acknowledge valuable discussions with Dr. Navin Sharma and Dr. DS Gurumurthy (both ITC R&D Centre, Bangalore, India) and with Dr. BR Thumma and Dr. Simon Southerton (both CSIRO-Plant Industry, Canberra, Australia), and also thank the Eucagen website (http://eucalyptusdb.bi.up.ac.za/gbrowse8x) for making the Eg sequence available.

References

  1. Rafalski A: Applications of single nucleotide polymorphisms in crop genetics.

    Curr Opin Plant Biol 2002, 5:94-100.

  2. Perkel J: SNP genotyping: six technologies that keyed a revolution.

    Nature Methods 2008, 5:447-454.

  3. Novaes E, Drost DR, Farmerie WG, Pappas GJ Jr, Grattapaglia D, Sederoff RR, Kirst M: High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome.

    BMC Genomics 2008, 9:312.

  4. Kulheim C, Yeoh SH, Maintz J, Foley WJ, Moran GF: Comparative SNP diversity among four Eucalyptus species for genes from secondary metabolite biosynthetic pathways.

    BMC Genomics 2009, 10:452.

  5. Thumma BR, Nolan MF, Evans R, Moran GF: Polymorphisms in Cinnamoyl CoA Reductase (CCR) are associated with variation in microfibril angle in Eucalyptus spp.

    Genetics 2005, 171:1257-1265.

  6. Busov VB, Brunner AM, Strauss SH: Genes for control of plant stature and form.

    New Phytologist 2008, 177:589-607.