This article is part of the supplement: Proceedings from the Great Lakes Bioinformatics Conference 2011
Computational and experimental analyses of retrotransposon-associated minisatellite DNAs in the soybean genome
- Equal contributors
1 Program in Bioinformatics, Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660 USA
2 Department of Biology, Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660 USA
3 Present address: Department of Biochemistry and Molecular Biology, Mayo Graduate School, Rochester, MN 55905 USA
BMC Bioinformatics 2012, 13(Suppl 2):S13 doi:10.1186/1471-2105-13-S2-S13
Published: 13 March 2012
Retrotransposons are mobile DNA elements that spread through genomes via the action of element-encoded reverse transcriptases. They are ubiquitous constituents of most eukaryotic genomes, especially those of higher plants. The pericentromeric regions of soybean (Glycine max) chromosomes contain >3,200 intact copies of the Gmr9/GmOgre retrotransposon. Between the 3' end of the coding region and the long terminal repeat, this retrotransposon family contains a polymorphic minisatellite region composed of five distinct, interleaved minisatellite families. To better understand the possible role and origin of retrotransposon-associated minisatellites, a computational project was undertaken to map and physically characterize all members of these families in the G. max genome, irrespective of their association with Gmr9.
A computational pipeline was developed to map and analyze the organization and distribution of five Gmr9-associated minisatellites throughout the soybean genome. Polymerase chain reaction amplifications were used to experimentally assess the computational outputs.
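The mapping step of such a pipeline can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the exact-match search, the `max_gap` parameter, and the placeholder monomer sequences are all assumptions made for illustration. It scans a sequence for known monomers and groups adjacent hits into loci.

```python
from typing import Dict, List, Tuple

Hit = Tuple[int, int, str]  # (start, end, family name), half-open coordinates


def find_hits(sequence: str, monomers: Dict[str, str]) -> List[Hit]:
    """Return every exact match of each monomer, sorted by start position."""
    hits: List[Hit] = []
    for name, monomer in monomers.items():
        pos = sequence.find(monomer)
        while pos != -1:
            hits.append((pos, pos + len(monomer), name))
            # Advance by one base so overlapping matches are also reported.
            pos = sequence.find(monomer, pos + 1)
    return sorted(hits)


def group_into_loci(hits: List[Hit], max_gap: int = 0) -> List[List[Hit]]:
    """Group sorted hits; a hit joins the current locus when it starts
    no more than max_gap bases after the end of the previous hit."""
    loci: List[List[Hit]] = []
    for hit in hits:
        if loci and hit[0] - loci[-1][-1][1] <= max_gap:
            loci[-1].append(hit)
        else:
            loci.append([hit])
    return loci


# Placeholder monomers (hypothetical sequences, not the Gmr9 families).
monomers = {"famA": "ACGTT", "famB": "GGCA"}
sequence = "TT" + "ACGTT" + "GGCA" + "ACGTT" + "N" * 10 + "GGCA" + "GGCA"

hits = find_hits(sequence, monomers)
loci = group_into_loci(hits)
families_per_locus = [[name for _, _, name in locus] for locus in loci]
```

Here the first locus interleaves two families and the second is a pure tandem array. A real genome-scale search would likely need approximate matching to tolerate polymorphic monomers, which this sketch does not attempt.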
A total of 63,841 copies of Gmr9-associated minisatellites were recovered from the assembled G. max genome. Ninety percent were associated with Gmr9, an additional 9% with other annotated retrotransposons, and 1% with uncharacterized repetitive DNAs. Monomers were tandemly interleaved and repeated up to 149 times per locus.
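A breakdown by association of the kind reported above could be tabulated along the following lines. This is a sketch under stated assumptions rather than the published method: the interval representation, the category labels, the first-match priority rule, and the toy coordinates are hypothetical, and the numbers in the example are not data from the study.

```python
from collections import Counter
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on one chromosome


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_locus(locus: Interval,
                   annotations: Dict[str, List[Interval]],
                   default: str = "uncharacterized") -> str:
    """Return the first annotation label (in dict order) overlapping the locus."""
    for label, intervals in annotations.items():
        if any(overlaps(locus, interval) for interval in intervals):
            return label
    return default


def category_percentages(loci: List[Interval],
                         annotations: Dict[str, List[Interval]]) -> Dict[str, float]:
    """Percentage of loci assigned to each category; empty input gives {}."""
    counts = Counter(classify_locus(locus, annotations) for locus in loci)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy coordinates, purely illustrative.
annotations = {
    "Gmr9": [(0, 100)],
    "other_retrotransposon": [(200, 300)],
}
loci = [(10, 20), (50, 60), (250, 260), (400, 410)]
percentages = category_percentages(loci, annotations)
```

Listing `Gmr9` first in `annotations` gives it priority when a locus overlaps more than one annotated element; a different tie-breaking rule would change the reported proportions.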
The computational pipeline enabled a fast, accurate, and detailed characterization of known minisatellites in a large, downloaded DNA database, and PCR amplification supported the general organization of these arrays.