Research article

Microarrays for global expression constructed with a low redundancy set of 27,500 sequenced cDNAs representing an array of developmental stages and physiological conditions of the soybean plant

Lila O Vodkin1*, Anupama Khanna1,7, Robin Shealy1, Steven J Clough1,8, Delkin Orlando Gonzalez1, Reena Philip1,9, Gracia Zabala1, Françoise Thibaud-Nissen1,10, Mark Sidarous1, Martina V Strömvik112, Elizabeth Shoop12,2, Christina Schmidt2, Ernest Retzel2, John Erpelding3, Randy C Shoemaker3, Alicia M Rodriguez-Huete134, Joseph C Polacco4, Virginia Coryell5, Paul Keim5, George Gong6, Lei Liu6, Jose Pardinas6 and Peter Schweitzer146

Author affiliations

1 Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA

2 Center for Computational Genomics and Bioinformatics, University of Minnesota, Minneapolis, MN, 55455, USA

3 USDA/ARS, Department of Agronomy, Iowa State University, Ames, IA, 50011, USA

4 Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA

5 Department of Biology, Northern Arizona University, Flagstaff, AZ, 86011, USA

6 Keck Center for Comparative and Functional Genomics, University of Illinois, Urbana, IL, 61801, USA

7 Epicentre, 726 Post Road, Madison, WI, 53713, USA

8 USDA/ARS, National Soybean Research Laboratory, University of Illinois, Urbana, IL, 61801, USA

9 Food and Drug Administration, Rockville, MD, 20850, USA

10 The Institute for Genomic Research, 9212 Medical Center Drive, Rockville, MD, 20850, USA

11 Department of Plant Science, McGill University, 2111 Lakeshore, St. Anne-de-Bellevue, QC, H9X3V9, Canada

12 Mathematics and Computer Science, Macalester College, St. Paul, MN, 55105, USA

13 Department of Microbiology, School of Medicine, University of Nevada-Reno, Reno, NV, USA

14 Biotechnology Resource Center, Cornell University, Ithaca, NY, 14853, USA


Citation and License

BMC Genomics 2004, 5:73  doi:10.1186/1471-2164-5-73

Published: 29 September 2004



Microarrays are an important tool with which to examine coordinated gene expression. Soybean (Glycine max) is one of the most economically valuable crop species in the world food supply. To accelerate both gene discovery and hypothesis-driven research in soybean, global expression resources needed to be developed. Microarrays have many applications in determining patterns of expression in different tissues or during conditional treatments by dual labeling of the mRNAs. In addition, discovery of the molecular basis of traits through examination of naturally occurring variation in hundreds of mutant lines could be enhanced by the construction and use of soybean cDNA microarrays.


We report the construction and analysis of a low redundancy 'unigene' set of 27,513 clones that represent a variety of soybean cDNA libraries made from a wide array of source tissue and organ systems, developmental stages, and stress or pathogen-challenged plants.

The set was assembled from the 5' sequence data of the cDNA clones using cluster analysis programs. The selected clones were then physically reracked and sequenced at the 3' end. To increase gene discovery from immature cotyledon libraries that contain abundant mRNAs representing storage protein gene families, we utilized a high-density filter normalization approach to preferentially select more weakly expressed cDNAs. All 27,513 cDNA inserts were amplified by polymerase chain reaction. The amplified products, along with some repetitively spotted control or 'choice' clones, were used to produce three 9,728-element microarrays that have been used to examine tissue-specific gene expression and global expression in mutant isolines.
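As a rough check on the figures above, the short Python sketch below compares the total number of elements on the three arrays with the size of the unigene set. It assumes that each of the 27,513 clones is spotted exactly once across the three arrays, which the text here does not state. The function name `spare_elements` is hypothetical and used only for illustration.

```python
# Figures taken from the abstract.
UNIGENE_CLONES = 27_513
ARRAYS = 3
ELEMENTS_PER_ARRAY = 9_728


def spare_elements(clones: int, arrays: int, elements_per_array: int) -> int:
    """Return the elements left over after placing every clone once.

    Assumption (not stated in the text): each clone occupies exactly
    one element somewhere on the array set.
    """
    total = arrays * elements_per_array
    if clones > total:
        raise ValueError("clone set does not fit on the arrays")
    return total - clones


total_elements = ARRAYS * ELEMENTS_PER_ARRAY
spare = spare_elements(UNIGENE_CLONES, ARRAYS, ELEMENTS_PER_ARRAY)

print(f"total elements on the array set: {total_elements}")
print(f"elements not needed for unigene clones: {spare}")
```

Under that assumption, the elements not needed for unigene clones give an upper bound on the room available for the repetitively spotted control or 'choice' clones and any blank positions.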


Global expression studies will be greatly aided by the availability of the sequence-validated and low redundancy cDNA sets described in this report. These cDNAs and ESTs represent a wide array of developmental stages and physiological conditions of the soybean plant. We also demonstrate that the quality of the data from the soybean cDNA microarrays is sufficiently reliable to examine isogenic lines that differ with respect to a mutant phenotype and thereby to define a small list of candidate genes potentially encoding or modulated by the mutant phenotype.