MicroRNAs (miRNAs) are a class of small (~21 nucleotide) non-coding RNAs that have recently gained much attention due to their perceived role as master regulators of gene expression in eukaryotes. They are responsible for fine-tuning gene expression and, in plants, have been shown to be involved in a diverse range of biological processes such as development and architecture, flowering, cell differentiation and response to biotic and abiotic stresses. The repertoire of expressed miRNAs differs among cell types, tissues, developmental stages and environmental conditions, among other factors. Nevertheless, the exact function of the thousands of miRNA sequences present in miRBase [http://www.mirbase.org/] has not been elucidated. At this point, discovery and profiling of new and conserved miRNAs are critical to understanding their function and mechanism. Deep sequencing with next-generation platforms is the methodology of choice for this purpose, as its ultra-high throughput permits a comprehensive interrogation of the small RNA transcriptome, including de novo identification and relative quantification of different small RNA species.
Due to its economic importance, the Eucalyptus grandis genome has been sequenced by JGI, and the annotation of its miRNAs is pivotal. To provide the first large-scale experimental characterization of Eucalyptus miRNAs, we performed an Illumina deep sequencing run that allowed us to discover miRNAs and quantify their levels in two different tissues, xylem and leaves. Additionally, to gain insight into the observed phenotypic differences in wood quality among Eucalyptus species, we characterized the xylem small RNA transcriptome of two different E. globulus individuals and integrated the results to catalog conserved and Eucalyptus-specific miRNA gene families.
Materials and methods
Four biological samples were used: xylem from two E. globulus genotypes, and xylem and leaf from E. grandis BRASUZ1, the genotype currently being sequenced by JGI. Total RNA was extracted with a CTAB protocol to a total amount of 10 mg per sample. The small RNA fractions were barcoded and sequenced in a single flow cell on an Illumina GA II Sequencing System by Fasteris [http://www.fasteris.com]. A computational pipeline was developed specifically to process the deep sequencing data. The pre-processing step cleans the sequences by quality screening, adapter sequence removal and contaminant checking. Cleaned reads were sorted according to size, quantified (tag counting) and used to create an additional set of non-redundant sequences (using uclust). Bowtie was used to map sequences against the 8X draft genome sequence of the E. grandis BRASUZ1 genotype. Mapped positions in the genome were extended by 150 bases and used as input for secondary structure prediction (miRDeep) to test for the stem-loop structure of miRNA precursors. Northern blot hybridization is being used for experimental validation of some conserved and potentially new Eucalyptus miRNA sequences.
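The pre-processing and tag-counting steps can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the adapter sequence, the length bounds, the function names and the choice to extend mapped positions by 150 bases on both sides are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical 3' adapter sequence, used here only for illustration.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_adapter(read, adapter=ADAPTER, min_match=8):
    """Return the insert preceding the 3' adapter, or None if no adapter is found.

    Only the first `min_match` adapter bases are searched for (exact match).
    """
    idx = read.find(adapter[:min_match])
    return read[:idx] if idx != -1 else None


def clean_reads(reads, min_len=18, max_len=30):
    """Yield adapter-trimmed inserts without ambiguous bases and within size bounds."""
    for read in reads:
        insert = trim_adapter(read)
        if insert is None or "N" in insert:
            continue  # no adapter found, or contains an uncalled base
        if min_len <= len(insert) <= max_len:
            yield insert


def count_tags(inserts):
    """Collapse identical inserts into non-redundant tags with their read counts."""
    return Counter(inserts)


def extend_window(start, end, flank=150, chrom_len=None):
    """Extend a mapped interval by `flank` bases on each side (assumed symmetric)."""
    new_start = max(0, start - flank)
    new_end = end + flank
    if chrom_len is not None:
        new_end = min(chrom_len, new_end)
    return new_start, new_end
```

In such a sketch, the keys of the counter returned by `count_tags` form the non-redundant sequence set, and the windows returned by `extend_window` would delimit the genomic segments passed on to secondary structure prediction.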
Results and conclusions
The total number of reads was 6,104,498, ranging from 1,115,404 to 1,766,355 per sample, with an average size of 36 nt. After pre-processing, the total number of sequences was reduced to 1,980,958. As expected, the read size distribution has two main peaks, at 21 and 24 nt. Interestingly, comparative analysis of the size distribution shows a higher abundance of the 24 nt fraction in all samples, up to 3.75 times higher than the 21 nt fraction. The 24 nt small RNAs are predominantly small interfering RNAs (siRNAs), which are involved in RNA-directed DNA methylation resulting in gene and transposon silencing. Taken together, reads from the four samples yielded 169,642 unique sequences that were mapped against the genome. Of these, 70.55% had at least one alignment to the genome reported, 23.54% failed to align and 5.91% mapped to multiple loci, indicative of repeat regions.
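The size distribution and mapping summaries reported above could be derived from tag counts and per-sequence alignment counts roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the three mapping categories are treated as mutually exclusive (single locus, multiple loci, unaligned), which is an interpretation based on the reported percentages summing to 100%.

```python
from collections import Counter


def size_distribution(tag_counts):
    """Sum read counts per insert length from a {sequence: count} mapping."""
    dist = Counter()
    for seq, n in tag_counts.items():
        dist[len(seq)] += n
    return dist


def ratio_24_to_21(dist):
    """Abundance of the 24 nt fraction relative to the 21 nt fraction."""
    if dist[21] == 0:
        raise ValueError("no 21 nt reads in this sample")
    return dist[24] / dist[21]


def mapping_summary(hits_per_sequence):
    """Percentage of sequences with one, several, or no genomic alignments.

    `hits_per_sequence` maps each unique sequence ID to its number of
    reported alignments. Categories are assumed to be mutually exclusive.
    """
    total = len(hits_per_sequence)
    if total == 0:
        raise ValueError("no sequences to summarize")
    counts = {"single": 0, "multiple": 0, "unaligned": 0}
    for n_hits in hits_per_sequence.values():
        if n_hits == 0:
            counts["unaligned"] += 1
        elif n_hits == 1:
            counts["single"] += 1
        else:
            counts["multiple"] += 1
    return {key: 100.0 * value / total for key, value in counts.items()}
```

Applied per sample, `ratio_24_to_21` gives the kind of 24 nt to 21 nt comparison described in the text, and `mapping_summary` gives the breakdown of unique sequences by alignment outcome.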
- Mapping 20-22 nt reads against the reference genome
- Annotation of plant miRNAs
- Quantitative differences
- Experimental validation
This work was supported by the Brazilian Ministry of Science and Technology through CNPq grant 577047/2008-6 and FAP-DF Grant NEXTREE 193.000.570/2009 and EMBRAPA Macroprogram 2 project grant 02.07.01.004.
Plant Molecular Biology Reporter 1993, 11(2):113-116.