This article is part of the supplement: Selected articles from the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013): Genomics

Open Access Proceedings

Genomic differences between cultivated soybean, G. max and its wild relative G. soja

Trupti Joshi (1,2,3,4), Babu Valliyodan (3,5), Jeng-Hung Wu (6), Suk-Ha Lee (7,8), Dong Xu (1,2,3,4) and Henry T Nguyen (2,3,5)*

Author affiliations

1 Department of Computer Science, University of Missouri, Columbia, MO 65211, USA

2 Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA

3 National Center for Soybean Biotechnology, University of Missouri, Columbia, MO 65211, USA

4 Informatics Institute, University of Missouri, Columbia, MO 65211, USA

5 Division of Plant Sciences, University of Missouri, Columbia, MO 65211, USA

6 Department of Medicine, National Yang-Ming University, Taipei, Taiwan, R.O.C

7 Department of Plant Science and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Korea

8 Plant Genomics and Breeding Institute, Seoul National University, Seoul 151-921, Korea

Citation and License

BMC Genomics 2013, 14(Suppl 1):S5  doi:10.1186/1471-2164-14-S1-S5

Published: 21 January 2013



Glycine max is an economically important crop, and many different varieties of soybean exist around the world. The first draft sequences and gene models of both G. max (domesticated soybean) and G. soja (wild soybean) became available in 2010, opening the door to comprehensive comparative genomics studies between the two.


We further analysed the sequences and identified 425 genes that are unique to G. max and absent from G. soja. We also studied the genes with a significant number of non-synonymous SNPs in their upstream regions. Twelve genes involved in seed development, three in oil and six in protein concentration are unique to G. max. A significant number of the unique genes overlap with the QTL regions of three traits: seed, oil and protein. We have also developed a graphical chromosome visualizer as part of the Soybean Knowledge Base (SoyKB) tools for molecular breeding, which we used to analyse and visualize the QTL regions for multiple traits that overlap with the deletions and SNPs in G. soja.
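The overlap between unique genes and QTL regions described above can be illustrated with a simple interval comparison. The following is a minimal sketch only, not the pipeline used in the study: the `Interval` type, the `genes_in_qtls` helper, the gene and QTL names, and all coordinates are hypothetical and chosen purely for illustration. It assumes 1-based, inclusive coordinates.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """A named genomic region (hypothetical type for this sketch)."""
    name: str
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both regions lie on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_in_qtls(genes, qtls):
    """Map each QTL name to the names of the genes that overlap it."""
    # Group genes by chromosome so each QTL is compared only with candidates
    # on its own chromosome.
    by_chrom = defaultdict(list)
    for gene in genes:
        by_chrom[gene.chrom].append(gene)

    hits = {}
    for qtl in qtls:
        hits[qtl.name] = [g.name for g in by_chrom[qtl.chrom] if overlaps(g, qtl)]
    return hits


# Made-up coordinates for illustration only; not data from the study.
genes = [
    Interval("geneA", "Gm01", 1_000, 2_500),
    Interval("geneB", "Gm01", 9_000, 9_800),
    Interval("geneC", "Gm02", 1_200, 1_900),
]
qtls = [
    Interval("seed_qtl_1", "Gm01", 2_000, 5_000),
    Interval("oil_qtl_1", "Gm02", 3_000, 4_000),
]

result = genes_in_qtls(genes, qtls)
print(result)
```

For genome-scale inputs, a sorted sweep or an interval tree would likely be preferable to this pairwise comparison; the simple form is kept here for readability.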


The comparison of the G. max and G. soja genome sequences shows significant differences in the genomic composition of the two. These differences also highlight the phenotypic differences between them in seed development, oil and protein traits. The results have been integrated into the SoyKB resource and are publicly available for users to browse.