Open Access Research article

Prevalence of single nucleotide polymorphism among 27 diverse alfalfa genotypes as assessed by transcriptome sequencing

Xuehui Li1, Ananta Acharya2, Andrew D Farmer3, John A Crow3, Arvind K Bharti3, Robin S Kramer3, Yanling Wei1, Yuanhong Han1, Jiqing Gou1, Gregory D May3, Maria J Monteros1 and E Charles Brummer1*

Author affiliations

1 The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA

2 Institute of Plant Breeding, Genetics & Genomics, University of Georgia, Athens, GA, 30602, USA

3 National Center for Genome Resources, Santa Fe, NM, 87505, USA

For all author emails, please log on.

Citation and License

BMC Genomics 2012, 13:568  doi:10.1186/1471-2164-13-568

Published: 29 October 2012

Abstract

Background

Alfalfa, a perennial, outcrossing species, is a widely planted forage legume producing highly nutritious biomass. Currently, improvement of cultivated alfalfa mainly relies on recurrent phenotypic selection. Marker assisted breeding strategies can enhance alfalfa improvement efforts, particularly if many genome-wide markers are available. Transcriptome sequencing enables efficient high-throughput discovery of single nucleotide polymorphism (SNP) markers for a complex polyploid species.

Result

The transcriptomes of 27 alfalfa genotypes, including elite breeding genotypes, parents of mapping populations, and unimproved wild genotypes, were sequenced using an Illumina Genome Analyzer IIx. De novo assembly of quality-filtered 72-bp reads generated 25,183 contigs with a total length of 26.8 Mbp and an average length of 1,065 bp, with an average read depth of 55.9-fold for each genotype. Overall, 21,954 (87.2%) of the 25,183 contigs represented 14,878 unique protein accessions. Gene ontology (GO) analysis suggested that a broad diversity of genes was represented in the resulting sequences. The realignment of individual reads to the contigs enabled the detection of 872,384 SNPs and 31,760 InDels. High resolution melting (HRM) analysis was used to validate 91% of 192 putative SNPs identified by sequencing. Both allelic variants at about 95% of SNP sites identified among five wild, unimproved genotypes are still present in cultivated alfalfa, and all four US breeding programs also contain a high proportion of these SNPs. Thus, little evidence exists among this dataset for loss of significant DNA sequence diversity from either domestication or breeding of alfalfa. Structure analysis indicated that individuals from the subspecies falcata, the diploid subspecies caerulea, and the tetraploid subspecies sativa (cultivated tetraploid alfalfa) were clearly separated.

Conclusion

We used transcriptome sequencing to discover large numbers of SNPs segregating in elite breeding populations of alfalfa. Little loss of SNP diversity was evident between unimproved and elite alfalfa germplasm. The EST and SNP markers generated from this study are publicly available at the Legume Information System ( http://medsa.comparative-legumes.org/ webcite) and can contribute to future alfalfa research and breeding applications.