Open Access Research article

Identification and characterization of transcript polymorphisms in soybean lines varying in oil composition and content

Wolfgang Goettel1, Eric Xia2, Robert Upchurch3, Ming-Li Wang4, Pengyin Chen5 and Yong-Qiang Charles An1*

Author Affiliations

1 USDA-ARS, Midwest Area, Plant Genetics Research Unit at Donald Danforth Plant Science Center, 975 N Warson Rd, St. Louis, MO 63132, USA

2 508 East Stoughton Street, Champaign, IL 61820, USA

3 USDA-ARS, Soybean and Nitrogen Fixation Research, 2417 Gardner Hall, Raleigh, NC 27695, USA

4 USDA-ARS, Plant Genetic Resources Conservation Unit, 1109 Experiment St., Griffin, GA 30223, USA

5 Department of Crop, Soil and Environmental Sciences, University of Arkansas, Fayetteville, AR 72701, USA

BMC Genomics 2014, 15:299  doi:10.1186/1471-2164-15-299

Published: 23 April 2014



Variation in seed oil composition and content among soybean varieties is largely attributed to differences in the transcript sequences and/or transcript accumulation of oil-production-related genes in seeds. Discovering and analyzing sequence and expression variations in these genes will accelerate the improvement of soybean oil quality.


To identify these variations, we sequenced the seed transcriptomes of nine soybean lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small indels) were identified among the lines, and their effects on the encoded protein sequences and functions were predicted. The study also provided independent evidence that the lack of FAD2-1A gene activity caused the elevated oleic acid level in soybean line M23, and that a non-synonymous SNP in the coding sequence of FAB2C caused the elevated stearic acid level in line FAM94-41.


As a proof of concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated that it is highly effective for discovering genetic and transcript variations that alter oil quality traits. The collection of transcript polymorphisms, coupled with their predicted functional effects, will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality.