Opposite GC skews at the 5' and 3' ends of genes in unicellular fungi
Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
BMC Genomics 2011, 12:638 doi:10.1186/1471-2164-12-638Published: 30 December 2011
GC-skews have previously been linked to transcription in some eukaryotes. They have been associated with transcription start sites, with the coding strand G-biased in mammals and C-biased in fungi and invertebrates.
We show a consistent and highly significant pattern of GC-skew within genes of almost all unicellular fungi. The pattern of GC-skew is asymmetrical: the coding strand of genes is typically C-biased at the 5' ends but G-biased at the 3' ends, with intermediate skews at the middle of genes. Thus, the initiation, elongation, and termination phases of transcription are associated with different skews. This pattern influences the encoded proteins by generating differential usage of amino acids at the 5' and 3' ends of genes. These biases also affect fourfold-degenerate positions and extend into promoters and 3' UTRs, indicating that skews cannot be accounted by selection for protein function or translation.
We propose two explanations, the mutational pressure hypothesis, and the adaptive hypothesis. The mutational pressure hypothesis is that different co-factors bind to RNA pol II at different phases of transcription, producing different mutational regimes. The adaptive hypothesis is that cytidine triphosphate deficiency may lead to C-avoidance at the 3' ends of transcripts to control the flow of RNA pol II molecules and reduce their frequency of collisions.