K-SPMM: a database of murine spermatogenic promoters modules & motifs
1 Department of Computer Science, Wayne State University, 5143 Cass Avenue, 431 State Hall, Detroit, MI 48202, USA
2 Applied Genomics Technologies Center, Bioinformatics Group, BioSciences, 5047 Gullen Mall, Detroit, MI 48202, USA
3 Department of Obstetrics and Gynecology, Wayne State University, 275 E. Hancock, Detroit, MI, 48201, USA
4 Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, 5240 Eugene Applebaum Building, 259 Mack Avenue, Detroit, MI 48201, USA
5 Institute for Scientific Computing, Wayne State University, 275 E. Hancock, Detroit, MI, 48201, USA
BMC Bioinformatics 2006, 7:238 doi:10.1186/1471-2105-7-238
Published: 3 May 2006
Understanding the regulatory processes that coordinate the cascade of gene expression leading to male gamete development has proven challenging. Research has been hindered in part by an incomplete picture of the regulatory elements that are both characteristic of and distinctive to the broad population of spermatogenically expressed genes.
K-SPMM, a database of murine Spermatogenic Promoters Modules and Motifs, has been developed as a web-based resource for the comparative analysis of promoter regions and their constituent elements in developing male germ cells. The system contains data on 7,551 genes and 11,715 putative promoter regions in Sertoli cells, spermatogonia, spermatocytes and spermatids. K-SPMM provides a detailed portrait of promoter site components, ranging from broad distributions of transcription factor binding sites to graphical illustrations of dimeric modules positioned relative to individual transcription start sites. Binding sites are identified by their similarity to position weight matrices catalogued in either the JASPAR or the TRANSFAC transcription factor archives. A flexible search function allows sub-populations of promoters to be selected according to their presence in any of the four cell types, their association with a list of genes, or their component transcription factor families.
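As an illustration of how a binding site can be identified by similarity to a position weight matrix, the following is a minimal sketch and not the K-SPMM implementation. The count matrix, the pseudocount, the uniform background frequency, the score threshold and the function names `to_log_odds` and `scan` are all assumptions made for this example; real matrices would come from JASPAR or TRANSFAC.

```python
import math

BASES = "ACGT"

# Hypothetical count matrix for a 4-bp motif (not taken from JASPAR or TRANSFAC).
# One row per motif position; columns are counts of A, C, G, T.
COUNTS = [
    [8, 0, 1, 1],
    [0, 9, 0, 1],
    [1, 0, 9, 0],
    [0, 1, 0, 9],
]


def to_log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert counts to log2 odds against an assumed uniform background."""
    pwm = []
    for row in counts:
        total = sum(row) + pseudocount * len(row)
        pwm.append(
            [math.log2((c + pseudocount) / total / background) for c in row]
        )
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for each window scoring >= threshold."""
    seq = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in BASES for base in window):
            continue
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PWM = to_log_odds(COUNTS)
# Assumed threshold, chosen so that only close matches to the motif are reported.
hits = scan("TTACGTGGACGTNACGA", PWM, threshold=5.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

Only the forward strand is scanned here. A fuller scan would also score the reverse complement and would normally choose the threshold per matrix rather than use a single fixed value.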
This system can now be used independently or in conjunction with other databases of gene expression as an aid to research on networks of co-regulation. We illustrate this with the spermiogenically active protamine locus, where the predicted binding sites align well with biologically footprinted protein-binding domains.