Functional characterization of endogenous siRNA target genes in Caenorhabditis elegans
1 Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland
2 Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland
3 Department of Pharmacology and Toxicology, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland
BMC Genomics 2008, 9:270 doi:10.1186/1471-2164-9-270
Published: 3 June 2008
Small interfering RNA (siRNA) molecules mediate sequence-specific silencing in RNA interference (RNAi), a gene regulatory phenomenon observed in almost all organisms. Large-scale sequencing of small RNA libraries obtained from C. elegans has revealed that a broad spectrum of siRNAs is endogenously transcribed from genomic sequences. The biological role and molecular diversity of C. elegans endogenous siRNA (endo-siRNA) molecules nonetheless remain poorly understood. To gain insight into their biological function, we annotated two large libraries of endo-siRNA sequences, identified their cognate targets, and performed gene ontology analysis to identify enriched functional categories.
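As an illustration of what identifying "enriched functional categories" can involve, the sketch below computes a one-sided hypergeometric p-value for a single gene ontology term. This is a common choice for enrichment testing, but it is an assumption here: the text above does not state which statistical test was used. The function name and all counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in a sample.

    population: total number of genes considered (background)
    annotated:  genes in the background that carry the GO term
    sample:     number of siRNA target genes drawn from the background
    hits:       target genes that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Hypothetical counts, for illustration only (not data from this study).
p_value = hypergeom_enrichment_p(population=1000, annotated=50, sample=40, hits=8)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

A small p-value would indicate that the term occurs among the targets more often than expected by chance; in practice a multiple-testing correction would also be needed, since many terms are tested at once.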
Systematic trends in the categorization of target genes according to siRNA length were observed: 18- to 22-mer siRNAs were associated with genes required for embryonic development; 23-mers were associated uniquely with post-embryonic development; 24- to 26-mers were associated with phosphorus metabolism or protein modification. Moreover, we observed that some Argonaute-related genes are associated with siRNAs that have multiple reads. Sequence frequency graphs suggest that siRNAs of different lengths share similarities in overall sequence structure: the 5' end begins with G, while the body consists predominantly of U and C.
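The length binning and sequence frequency graphs described above can be sketched roughly as follows: reads are grouped by length, and the fraction of each nucleotide is tallied at every position within a length group. This is a minimal sketch rather than the pipeline used in the study; the function names and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict

def group_by_length(seqs):
    """Group sequences by length, normalizing to upper-case RNA letters."""
    groups = defaultdict(list)
    for seq in seqs:
        rna = seq.upper().replace("T", "U")
        groups[len(rna)].append(rna)
    return dict(groups)

def position_frequencies(seqs):
    """Per-position nucleotide fractions for sequences of equal length."""
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({nt: counts.get(nt, 0) / len(seqs) for nt in "ACGU"})
    return freqs

# Toy reads, for illustration only (not sequences from the libraries).
toy_reads = ["GUCU", "GCUC", "GUUC", "ACGUU"]
by_length = group_by_length(toy_reads)
for length, group in sorted(by_length.items()):
    first = position_frequencies(group)[0]
    print(length, "nt:", len(group), "reads; 5' G fraction =", first["G"])
```

Plotting the per-position fractions for each length group would give a graph of the kind referred to above, in which a 5' G and a U/C-rich body would be visible as high fractions at the corresponding positions.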
These results suggest that the length of an endogenous siRNA molecule is relevant to its biological function, since the gene ontology categories of the cognate mRNA targets vary with siRNA length.