Email updates

Keep up to date with the latest news and content from BMC Plant Biology and BioMed Central.

Open Access Research article

Global RNA sequencing reveals that genotype-dependent allele-specific expression contributes to differential expression in rice F1 hybrids

Gaoyuan Song, Zhibin Guo, Zhenwei Liu, Qin Cheng, Xuefeng Qu, Rong Chen, Daiming Jiang, Chuan Liu, Wei Wang, Yunfang Sun, Liping Zhang, Yingguo Zhu and Daichang Yang*

Author Affiliations

State Key Laboratory of Hybrid Rice and College of Life Sciences, Wuhan University, Luojia Hill, Wuhan, Hubei Province 430072, China

For all author emails, please log on.

BMC Plant Biology 2013, 13:221  doi:10.1186/1471-2229-13-221

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2229/13/221


Received:25 June 2013
Accepted:9 December 2013
Published:21 December 2013

© 2013 Song et al.; licensee BioMed Central Ltd.

This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Extensive studies on heterosis in plants using transcriptome analysis have identified differentially expressed genes (DEGs) in F1 hybrids. However, it is not clear why yield in heterozygotes is superior to that of the homozygous parents or how DEGs are produced. Global allele-specific expression analysis in hybrid rice has the potential to answer these questions.

Results

We report a genome-wide allele-specific expression analysis using RNA-sequencing technology of 3,637–3,824 genes from three rice F1 hybrids. Of the expressed genes, 3.7% exhibited an unexpected type of monoallelic expression and 23.8% showed preferential allelic expression that was genotype-dependent in reciprocal crosses. Those genes exhibiting allele-specific expression comprised 42.4% of the genes differentially expressed between F1 hybrids and their parents. Allele-specific expression accounted for 79.8% of the genes displaying more than a 10-fold expression level difference between an F1 and its parents, and almost all (97.3%) of the genes expressed in F1, but non-expressed in one parent. Significant allelic complementary effects were detected in the F1 hybrids of rice.

Conclusions

Analysis of the allelic expression profiles of genes at the critical stage for highest biomass production from the leaves of three different rice F1 hybrids identified genotype-dependent allele-specific expression genes. A cis-regulatory mechanism was identified that contributes to allele-specific expression, leading to differential gene expression and allelic complementary effects in F1 hybrids.

Keywords:
Allele-specific expression; Complementary effects; Differentially expressed genes; Genotype-dependent monoallelic expression; Rice hybrids

Background

Heterosis, or hybrid vigor, refers to the superior biological functions of F1 hybrids compared with their parental homozygous or inbred lines. This phenomenon was first described by Charles Darwin and was later independently rediscovered by George H. Shull and Edward M. East in 1908 [1-3]. Although it is not well understood at the molecular level, heterosis has been exploited over the past half-century in plant and animal breeding. Two classic hypotheses, “dominance” and “overdominance”, have been proposed to explain hybrid vigor. The “dominance” hypothesis proposes that the detrimental allele from one parent is complemented by the superior allele from the other parent, and that the accumulated superior alleles in the F1 hybrids give rise to heterosis. By comparison, the “overdominance” hypothesis signifies that hybrid vigor results from the interaction between alleles brought together in the hybrid [4].

To attempt to discriminate between these hypotheses, extensive studies of gene effects and transcriptomics have been conducted [5-11]. Genetic analyses have revealed the genetic effects of additive, overdominance, dominance, and epistasis, and that interactions between different loci are associated with heterosis in different varieties [5-11]. Several studies analyzing transcription have indicated that DEGs commonly occur in inbred lines as well as between F1 hybrids and their parents. For example, 4–18% of maize genes are differentially expressed in different tissues of the maize inbred lines B73 and Mo17 according to microarray-based analyses [12]. In Arabidopsis seedlings, high-density single nucleotide polymorphism (SNP) analysis showed that 31% of all analyzed genes were differentially expressed between the parental inbred lines [13], while, in another study, 10.6% of genes were differentially expressed in different tissues of the hybrid rice LYP9 and its parents [14].

Recently, high-throughput RNA-sequencing technology revealed that 4-week-old shoots of the 93–11 and Nipponbare rice varieties had 24.0% DEGs, as did their reciprocal hybrids [15]. Moreover, a newly published global survey based on RNA sequencing technology found that approximately 70% of all expressed genes were differentially expressed between the two maize inbred parents B73 and Mo17, and that 42–55% were differentially expressed between the reciprocal F1 and its parents [16]. Although the molecular basis of heterosis has been attributed to the DEGs in the above hybrids, the underlying mechanism(s) causing differential expression remain unknown.

Several studies have shown that only one allele is expressed in heterozygotes [17-20], and monoallelic expression or an imbalance in heterozygote allelic expression has been extensively studied in humans and other mammals [21-23]. Transcription profile analyses have indicated that monoallelic expression could be caused by X-chromosome silencing, autosomal imprinting, or random events. Some studies with vegetative tissues from maize F1 hybrids identified several genes exhibiting allele-specific expression (ASE) [12,24,25], which differed markedly between the different F1 hybrids and was altered in response to environmental stress. This could contribute to heterosis.

The objectives of the present study were to explore global ASE in hybrid rice and to reveal the mechanism of differential expression in F1 hybrids using RNA sequencing. Three elite rice varieties were chosen that met the breeding objectives from different periods in China, Guangluai #4 (GL, 1970–1980s), Teqing (TQ, 1980–1990s), and 93–11 (1990s to present), plus their F1 hybrids, which we show have different levels of heterosis. Two F1 hybrids, GL × TQ and GL × 93-11, exhibited high heterosis, and the third, 93-11 × TQ, low heterosis. To obtain sufficient SNPs to distinguish the maternal and paternal alleles in F1 hybrids, the genomes of the three parents were re-sequenced. To identify more SNPs for further ASE analysis, nuclear RNA was extracted from leaves of the three F1 hybrids and their parents and subjected to Illumina RNA-Seq technology. We identify a global ASE profile that reveals a potential mechanism for an increased biomass-based, grain-yield heterosis.

Results

Global ASE analysis by RNA sequencing

To obtain sufficient SNPs for ASE analysis, we achieved a rice genome coverage of 17.7–27.7 fold to satisfy the minimum requirement of obtaining more than 90% SNPs [26]. A total of 76.2–119.0 million reads (100 bp per read) from three rice varieties serving as parents for the F1s were obtained using Illumina DNA-Seq technology (Additional file 1: Table S1), and 375,744–411,571 SNPs were detected between each parent with 98.0% accuracy (Additional file 2: Figure S1A, Additional file 3: Table S2). A total of 89.5–114.1 million reads (90 bp per read) were obtained from analogous tissue from the three F1 hybrids. Analysis disclosed that 29,064–29,928 genes at the secondary branch differentiation stage were expressed in the three F1 hybrids and their parents (Additional file 1: Table S1). The expression levels of 30 genes in the F1 hybrids and their parents were also determined, and quantitative real time polymerase chain reaction (qRT-PCR) and RNA-sequencing were shown to be highly correlated in terms of their analyses (expression level correlation coefficient r = 0.92-0.99 (Additional file 4: Table S3)). We found that 41,416–43,685 of the SNPs located in gene bodies had an average read coverage of 8.6–10.9 in the F1 hybrids. These SNPs were available for allelic expression analysis (Additional file 2: Figure S1B and Additional file 5: Table S4). Of these, 9,752–10,818 genes accounting for 33.5–36.1% of the total number of expressed genes containing SNPs could be used for distinguishing the alleles (Additional file 6: Figure S2). A total of 3,627–3,824 genes met the criterion of at least 10 SNP reads, so could be used for further ASE profile analyses.

Additional file 1: Table S1. Sequencing depth.

Format: DOCX Size: 16KB Download fileOpen Data

Additional file 2: Figure S1. SNP locations (A) The location of SNPs detected between parents. (B) The location of SNPs used to evaluate allele-specific expression in each F1 hybrid.

Format: TIFF Size: 1.8MB Download fileOpen Data

Additional file 3: Table S2. Confirmed SNPs.

Format: DOCX Size: 25KB Download fileOpen Data

Additional file 4: Table S3. List of genes used for expression level confirmation.

Format: XLSX Size: 16KB Download fileOpen Data

Additional file 5: Table S4. Read coverage in the ASE analysis.

Format: DOCX Size: 15KB Download fileOpen Data

Additional file 6: Figure S2. Number of expressed genes containing SNPs in different F1 hybrids.

Format: TIFF Size: 806KB Download fileOpen Data

Through these ASE profile analyses, the F1 hybrids were classified into three categories: monoallelic expression, in which only one allele from either the maternal or paternal parent is expressed; preferential allelic expression, in which expression levels differ by more than two-fold between alleles; and biallelic expression, in which expression levels vary by less than two-fold between alleles (Additional file 7: Figure S3). We found that 3.4–3.9% of the genes were monoallelic expression, 23.5–24.2% were preferential allelic expression, and 72.0–73.0% were biallelic expression (Figure 1A-C). Further analysis of paternally and maternally derived read coverage in each heterozygote showed that the two parents contributed equally to the F1 hybrids. Our results are consistent with those from Arabidopsis embryos [27], and suggest that no obvious parent-of-origin effect occurred in the vegetative tissues of the rice hybrids (Figure 1A–C and Additional file 8: Figure S4).

Additional file 7: Figure S3. A model of the differential allelic expression pattern found in hybrid rice: biallelic expression, preferential allelic expression, and monoallelic expression.

Format: TIFF Size: 864KB Download fileOpen Data

thumbnailFigure 1. Overall allele-specific expression (ASE) profiles in three F1 populations. The three types of ASE genes and their proportions detected in GL × TQ (A), GL × 93-11 (B), and 93-11 × TQ (C).

Additional file 8: Figure S4. The total read coverage of each allele from the two parents in the three F1 populations.

Format: TIFF Size: 863KB Download fileOpen Data

A cis-regulation mechanism and genotype-dependent monoallelic expression

To understand the relationship between gene expression in the parents and ASE in the F1 hybrids, the correlation between these parameters was determined for 3,627–3,824 genes, yielding correlation coefficients in the range of 0.70–0.76 (Figure 2A, P < 2.2E-16). A higher correlation coefficient was obtained from the same analysis with 2,026 ASE genes in the F1 hybrids (Figure 2A, Pearson’s r > 0.80, P < 2.2E-16). These results suggest that a cis-regulatory mechanism is occurring, that is, if the allele is transcribed in the parent it is also transcribed in its F1 hybrids, and if not in the parent then not in its F1 hybrids. Our data also indicate that only 5.4–19.3% of gene expression in F1 hybrids is non-additive. Trans-acting regulation may thus also contribute to the regulation of gene expression in rice hybrids, but would not have a major effect.

thumbnailFigure 2. The cis-regulatory mechanism and genotype-dependent monoallelic expression. (A) Correlation analysis between genic expression in the parental generation and allelic expression in the F1 generation of the three hybrids. Red, allelic-specific expressed genes (ASE); blue, biallelic expressed genes. (B) The number of variety-specific and commonly expressed monoallelic expressed genes in the three F1 hybrids. (C) Example of a monoallelic expression gene confirmed by RT-PCR sequencing of reciprocal F1 crosses. (D) Confirmation of genotype-dependent monoallelic expression patterns in the three F1 hybrids showing the origin of the alleles. (E) Example of a preferential allelic expression gene confirmed by RT-PCR sequencing of reciprocal F1 crosses. (F) Example of a biallelic expression gene confirmed by RT-PCR sequencing of reciprocal F1 crosses.

We identified 413 monoallelic expression genes in the three F1 hybrids (143 from GL × TQ, 129 from GL × 93-11, and 141 from 93-11 × TQ) (Additional file 9: Table S5). Of these, 108 were common to two F1 hybrids, but none were common to all three (Figure 2B). We randomly chose 134 monoallelic expression genes, accounting for 32.4% of the total, from the three hybrids and used reverse-transcription (RT)-PCR sequencing to verify a reliability of 91.8% (123/134) (Figure 2C, Additional file 10: Table S6). Through further investigation of allelic expression patterns in three reciprocal crosses, GL × 93-11 and 93-11 × GL, GL × TQ and TQ × GL, and 9311 × TQ and TQ × 93-11, we detected 109 monoallelic expression genes and 14 preferential allelic expression genes and found that all monoallelic expression and preferential allelic expression genes tested exhibited a genotype-dependent expression pattern, while 17 biallelic expression genes showed no difference between reciprocal crosses (Figure 2C–F, Additional file 11: Table S7). This shows that, regardless of the paternal or maternal origin in the reciprocal crossings, monoallelic expression and preferential allelic expression genes always express the allele from a given parent (Figure 2C and E). Combining our results with previous observations from maize [24], we suggest that a hitherto overlooked type of monoallelic expression occurs in eukaryotic organisms.

Additional file 9: Table S5. List of all monoallelically expressed genes.

Format: DOCX Size: 50KB Download fileOpen Data

Additional file 10: Table S6. List of confirmed monoallelically expressed genes.

Format: DOCX Size: 41KB Download fileOpen Data

Additional file 11: Table S7. List of confirmed preferential allelic expression and biallelic expression genes.

Format: DOCX Size: 24KB Download fileOpen Data

ASE results in transcriptome divergence in the F1 hybrids

Previous studies have demonstrated that higher levels of heterosis are associated with greater differences between the agronomic and/or metabolic traits of parents [28-30], and that DEGs (fold change >2.0, false discovery rate (FDR) <0.05) between F1 hybrids and their parents are among the major factors leading to heterosis [14,31,32]. To ascertain the extent to which monoallelic expression and preferential allelic expression genes contribute to heterosis, we dissected the global DEGs between F1 hybrids and their parents. The total number of DEGs in the three hybrids was correlated with heterosis level of both fresh and dry mass (r > 0.99, P < 0.03; Figure 3A–B).

thumbnailFigure 3. Contributions of the three allelic expression types of genes to the differentially expressed genes (DEGs). (A) The mid-parent heterosis level of total biomass at the secondary branch differentiation stage exhibited by the three F1 hybrid populations. Heterosis was evaluated using the fresh weight and dry weight of single plants from each F1 hybrid generation and their parents. (B) The preferential allelic expression genes compared between each F1 hybrid and its parents. (C) The percentages of DEGs represented by the genes of each allelic expression type. (D) The contributions of monoallelic expression, preferential allelic expression, and biallelic expression genes to the total genes and DEGs in the three F1s. (E) The proportions of genes with monoallelic expression, preferential allelic expression, and biallelic expression profiles in the F1 generation compared with their parents, sub-grouped according to fold differences in expression level. (F) The proportions of monoallelic expression, preferential allelic expression, and biallelic expression genes expressed in the F1 generation and one of the two parents but not expressed in the other parent.

To explore which mechanism(s) create DEGs in heterozygotes, we found that 95.1% of monoallelic expression genes, 50.3% of preferential allelic expression genes, and 30.4% of biallelic expression genes are DEGs (Figure 3C). Surprisingly, we discovered that the monoallelic expression and preferential allelic expression genes, comprising 27.5% of the total analyzed genes, amounted to 42.7% of the DEGs (Figure 3D). More specifically, the monoallelic expression genes, accounting for 3.7% of the total analyzed genes, gave rise to 10.0% of the DEGs, the preferential allelic expression genes (23.8%) amounted to 32.4%, and the biallelic expression genes (72.3%) were composed of only 57.6% of the DEGs (Figure 3D). By determining contributions to DEGs between F1 hybrids and their parents, we found that 57.2% of DEGs from monoallelic expression, 11.4% from preferential allelic expression, and 3.9% from biallelic expression exhibited a considerably greater than 10-fold difference. By contrast, 79.7% of DEGs from biallelic expression, 68.6% from preferential allelic expression, and 32.1% from monoallelic expression genes displayed a less than four-fold difference between the F1 and their parents (Figure 3E). The 71.6–83.5% of the genes that were expressed in the F1s but silenced in one of the two parents showed monoallelic expression patterns (Figure 3F). These results demonstrate that the majority of DEGs (>10 fold difference) are attributable to ASE, and that monoallelic expression genes in particular play an important role in gene expression divergence between F1 hybrids and their parents.

Complementary effects are mainly contributed by ASE genes

Because the presence of a cis-regulatory mechanism of allelic expression would be in accordance with the “dominance” hypothesis, we analyzed the transcriptomic profiles of all ASE genes in the three F1 hybrids and their parents. A large number of genotype-dependent monoallelic expression genes in F1 hybrids would also be in accordance with the “dominance” hypothesis. The results showed that an average of 51.6% (53.1% in GL × TQ, 55.0% in GL × 93-11, and 46.8% in 93-11 × TQ) of genes expressed in one parent and non-expressed in the other were categorized as monoallelic expression genes in F1 hybrids (Figure 4A class IV and Additional file 12: Table S8), whereas 30.2% (29.4, 27.1, and 34.0% in GL × TQ, GL × 93-11, and 93-11 × TQ, respectively) of genes expressed at low levels in either parent, but enhanced in F1 hybrids, were categorized as monoallelic expression genes (Figure 4A class III and Additional file 12: Table S8). Most of both types of monoallelic expression genes exhibited a mid-parent expression level (Additional file 12: Table S8). Therefore, the alleles only expressed in the F1 were those expressed in the inbred parent, while the alleles silenced in the inbred parent were also silenced in F1. This means that the total expression level of the gene in the F1 is one half that in the parent, which is equivalent to a mid-parent expression level (Figure 4A class IV and Additional file 12: Table S8). This dosage effect implies that a cis-regulatory mechanism is acting.

thumbnailFigure 4. Complementary effects of allele-specific expression genes contributing to transcriptome optimization of F1 hybrids. The left two lanes of each panel show the allelic expression patterns observed in the F1s, whereas the right three lanes compare the expression levels of these genes in the hybrids (lane 4) to those in the parents (lanes 3 and 5). The genes in groups I, II, III, and IV are those with no difference between the parents, a 2- to 10-fold difference between the parents, a greater than 10-fold difference between the parents, and expression in only one parent, respectively. (A) The expression levels of monoallelic expression genes in the F1 hybrids and their parents. (B) The expression levels of preferential allelic expression genes in the F1 hybrids and their parents. (C) The expression levels of biallelic expression genes in the F1 hybrids and their parents. The short bands on the same horizontal line indicate the same gene. The expression level of each gene was normalized by log10. The vertical bars on the right correlate color in the panels with relative levels of transcription.

Additional file 12: Table S8. The expression levels of genes with monoallelic expression in the F1 and parental populations.

Format: DOCX Size: 45KB Download fileOpen Data

Moreover, we found that more monoallelic expression genes were downregulated than upregulated in F1 (Additional file 13: Table S9), and that 9.7% of preferential allelic expression genes exhibited the same patterns as monoallelic expression genes in the F1 hybrids (Figure 4B classes III and IV). The analysis of genes within the categories of activated and enhanced expression in F1 hybrids found complementary effects of superior alleles from both parents with average values of 73.8% and 93.6% for class III and class IV genes, respectively. By contrast, a biallelic expression pattern was exhibited by only 26.2% and 6.4% in class III and class IV genes, respectively (Figure 4C and Additional file 14: Table S10). The proportion of genes with complementary effects in GL × TQ and GL × 93-11 was higher than that in 93-11 × TQ (7.8% and 7.5% versus 4.7%, respectively), which was consistent with the heterosis level of these F1 hybrids (Additional file 14: Table S10). Our data imply that ASE genes, most notably monoallelic expression genes, are the main contributors to allelic complementary effects in hybrid rice.

Additional file 13: Table S9. The expression changes of monoallelic expression genes between F1 and parents.

Format: DOCX Size: 17KB Download fileOpen Data

Additional file 14: Table S10. Allelic expression genes classified by the relative expression level of parents.

Format: DOCX Size: 19KB Download fileOpen Data

ASE genes have diverse biological functions in F1 hybrids

To ascertain the molecular and biological functions of monoallelic expression and preferential allelic expression genes, we performed gene ontology (GO) analysis. Four GO terms for molecular functions, nucleotide binding, receptor activity, protein binding, and kinase activity, were commonly found among both monoallelic expression and preferential allelic expression genes (Table 1 and Additional file 15: Tables S11 and Additional file 16: Table S12). These functions indicate that monoallelic expression genes are mainly involved in protein modification, signal transduction, and response to endogenous stimulus pathways. A wider diversity of functions, including biosynthesis, morphogenesis, and carbohydrate metabolism, was found for preferential allelic expression genes (Table 1 and Additional file 17: Tables S13 and Additional file18: Table S14), suggesting that ASE genes have important roles in biosynthesis, development, and their regulation. To distinguish between the enrichment of monoallelic expression and preferential allelic expression gene functions and that of all genes for ASE analysis, we performed GO analysis of 3,627–3,824 genes in three F1 hybrids. Limited GO terms were common to monoallelic expression and preferential allelic expression genes in different F1 hybrids (Additional files 19: Figures S5 and Additional file 20: Figure S6), which might be a consequence of specific monoallelic expression and preferential allelic expression genes occurring in different F1 hybrids, suggesting that allele-specific expression genes have different roles in different genomic backgrounds.

Table 1. The common functions of monoallelic expression and preferential allelic expression genes in the three rice hybrids

Additional file 15: Table S11. Molecular function of monoallelic expression genes.

Format: DOCX Size: 18KB Download fileOpen Data

Additional file 16: Table S12. Molecular function of preferential allelic expression genes.

Format: DOCX Size: 24KB Download fileOpen Data

Additional file 17: Table S13. Biological function of monoallelic expression genes.

Format: DOCX Size: 20KB Download fileOpen Data

Additional file 18: Table S14. Biological function of preferential allelic expression genes.

Format: DOCX Size: 23KB Download fileOpen Data

Additional file 19: Figure S5. Enriched biological functions in different groups of genes in GL × TQ (A), GL × 93-11 (B), and 93-11 × TQ (C).

Format: TIFF Size: 1013KB Download fileOpen Data

Additional file 20: Figure S6. Enriched molecular functions in different groups of genes in GL × TQ (A), GL × 93-11 (B), and 93-11 × TQ (C).

Format: TIFF Size: 894KB Download fileOpen Data

Discussion

Previous studies of transcriptome analyses related to heterosis have mainly been conducted using microarray analyses, and RT-PCR and RNA-sequencing technology [15,31,33]. Here, we combined RNA-sequencing with DNA re-sequencing technology to establish ASE assays and achieved a 20-fold coverage of rice genome re-sequencing data to identify SNPs. A strict statistical cut-off for SNP calling enabled fully quantitative analyses of both overall and allele-specific gene expression profiles to be obtained for rice leaves at the stage of secondary branch differentiation. These data were verified by PCR-sequencing and RT-PCR sequencing using analogous materials planted in the following year. This developing stage is important as grain yield is directly correlated with the biomass established early in vegetative growth [34]. SNP accuracies of 98.0% and 91.8% were confirmed, indicating the reliability of both genome and transcriptome data, respectively.

Monoallelic expression genes have been studied for nearly a half century in humans and other mammals [17-20,35,36]. Most cause X-chromosome silencing, autosomal imprinting, or random events [17-20,35,36], and contribute to dose-dependent gene expression, immune responses, and disease biogenesis, including several types of cancers [17-20,35-37]. ASE and regulation mechanisms have also been studied in humans and other animals [23,38,39]; however, by contrast, monoallelic expression is poorly understood in higher plants. Different studies have reported a high proportion of ASE genes in maize hybrids (50% and 60%) with cis-regulatory effects underlying the ASE [24,40], compared with a more limited number of monoallelic expression genes [25]. Other studies using F1 hybrids of japonica-indica, maize, and Arabidopsis focused on endosperm-localized genes and identified more than 100 imprinted genes [41-44]. However, no large-scale ASE analysis has previously been carried out in rice hybrids.

We identified 413 monoallelic expression and 2,659 preferential allelic expression genes in the three F1 hybrids via a global transcriptome analysis. Of the total genes analyzed, 3.4–3.9% exhibited monoallelic expression and 23.5–24.2% exhibited preferential allelic expression patterns. The proportion of ASE genes, moreover, did not differ significantly in the three F1 hybrids, which is similar to a previous report in maize hybrids [24]. Importantly, our results indicated that all monoallelic expression and preferential allelic expression genes tested exhibited genotype-dependent expression patterns in reciprocal crosses, which contrasts with the findings of monoallelic expression genes in humans and other mammals [21,45]. The observed genotype-dependent ASE in the vegetative tissue of hybrid rice could represent a common mechanism of allelic complementary effects in hybrids, and show the importance of parental genotype in both crossbreeding and hybrid breeding. Given the number of genes with genotype-dependent monoallelic expression involved in a wide range of GO categories that play important biological functions, many potential opportunities exist for them to contribute to DEGs and to produce the diverse phenotypes of the F1 hybrids.

In the present study, the genotype-dependent ASE genes might confer a fitness advantage to the heterozygous state relative to either of their homozygous parents. As most monoallelic expression genes were silenced or suppressed in one of the parents, we speculate that genotype-dependent monoallelic expression might be the consequence of artificial parent selection by breeders. To meet the breeding objectives, “superior alleles” are accumulated in the elite parent for long-term selection (Figure 4A). Such alleles could maintain allelic diversity in different varieties and enlarge the allele difference in the rice germplasm. Therefore, our findings of allelic complementary effects in F1 hybrid rice could guide the selection of elite parents through complementing a variety of the superior alleles.

Previous studies have indicated that differential gene expression is common between F1 and parents, and that it is a major contributor to heterosis at the transcription level [14,31]. For example, hundreds of differentially expressed genes were detected at different developmental stages between the elite hybrid rice LYP9 and its parents [14], and analogous results were obtained from studies with Arabidopsis and maize [28,32,46,47]. Our study also found that the percentage of DEGs between F1 hybrids and parents exactly correlated with the heterosis level of each F1 hybrid (Figure 3A and B).

Such results do not divulge, nonetheless, how the alleles from inbred lines become the DEGs in heterozygotes, although the mechanism behind this is revealed to some extent by genome-wide, ASE analysis using high-throughput RNA sequencing. Moreover, in the present study, the most significant DEGs between the F1 and parent were expressed in the F1 but silenced in one of the parents, accounting for 93.7% of the DEGs associated with ASE. Of these, 15.8% showed preferential allelic expression and 77.8% were monoallelic expression. Altogether, 27.5% of the ASE genes contributed to 42.4% of the total number of DEGs. Although the proportion of ASE genes was similar in the three F1 hybrids, the total numbers of DEGs differed, with fewer detected in 93-11 × TQ (Figure 3B) with its lower heterosis level (Figure 3A) and more found in GL × TQ and GL × 93-11 with their higher heterosis levels. The same results were also found between their parents, suggesting that transcriptome divergence in F1 hybrids is attributed to transcriptome divergence between parents. Crucially, increased parent allelic variation could be an important strategy for maintaining higher heterosis levels in hybrid rice breeding programs through enlarging the transcriptome diversity between parents, which have accumulated different sets of superior alleles.

A recent study has provided molecular evidence for a single gene model to support the “overdominance” hypothesis of heterosis in tomato hybrids [48]. Because of technological limitations, however, the maternal and paternal alleles in the F1 hybrids could not be distinguished effectively. The recent development of high-throughput sequencing technology provides the opportunity to study ASE in heterozygotes. Many genes have exhibited additive expression patterns in F1 hybrids in previous studies, and complementary effects at the gene expression level have been reported in hybrid maize [16]. Our data derived from global allelic expression profiles extend this result to hybrid rice, and also reveal the mechanism of DEG formation in heterozygotes. Our observed high correlation (>0.7) between allelic expression in F1 hybrids and gene expression in parental lines indicates that a cis-regulated mechanism plays an essential role in allelic expression. Allelic complementation effects, moreover, can be the outcome of a cis-regulatory mechanism mainly contributed to by ASE genes.

The findings of the present study support the “dominance” hypothesis for indica hybrid rice varieties and reveal that the consequences of complementation, primarily by genotype-dependent monoallelic expression and preferential allelic expression alleles, are an accumulation of “superior” alleles that confer monoallelic expression and preferential allelic expression genes in F1 hybrids. The expression of these “superior” alleles offers an opportunity to optimize the transcriptomes that give rise to heterosis in F1 hybrids. Although our results revealed that allelic complementary effects played a major contribution to gene expression in hybrid rice and support the “dominance” hypothesis, they do not exclude the contribution of different mechanisms to heterosis in hybrid rice and other crops.

Conclusions

Allelic expression profiles in hybrid rice determined by RNA-sequencing technology demonstrated a type of genotype-dependent monoallelic expression genes in plants. DEGs between parents and F1 hybrids were mainly attributable to ASE genes, which gave rise to the observed allelic complementary effects in F1 hybrids.

Methods

Plant material and phenotypic analysis

Reciprocal crosses were made between the three indica varieties, Guangluai #4 (GL), 93–11, and Teqing (TQ), at Hainan Island, China in the spring of 2009. The six reciprocal F1 hybrids were planted together with the three parents in Wuhan, China in the summer of 2009. Triplicate plots, each containing 30 individuals, were planted for all nine genotypes. Heterosis levels were evaluated by measuring the fresh and dry biomass of the above-ground parts of the plants at the secondary branch differentiation stage, which is critical for determining the highest biomass and maximum grain yield. The mid-parent heterosis level was calculated as described by Ma et al. [49]. Analogous leaves from the F1 hybrids and parents at the same development stages in 2010 were used to verify gene and allelic expression and genotype-dependent monoallelic expression.

Nuclear RNA extraction

The second fully expanded leaves were harvested at the secondary branch differentiation stage, immediately frozen in liquid nitrogen and stored at -80°C. Material from the triplicate plots was pooled at harvest. Nuclei were isolated from ~10 g of frozen leaves using the Plant Nuclei Isolation/Extraction Kit (Sigma, St. Louis, MO, USA). Total hnRNA was extracted from nuclei using Trizol (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s instructions, and treated with RNase-free DNase I (New England Biolabs, Ipswich, MA, USA) to remove any contaminating genomic DNA.

Library construction

The Illumina mRNA-Seq Sample Prep Kit (Illumina, San Diego, CA USA) was used to prepare the sequencing library with 3 μg of nuclear RNA. Fragmentation buffer in the kit was added directly to hnRNA to produce short fragments of 200–700 bp, which served as the templates for first-strand cDNA synthesis using random hexamers. Second-strand cDNA was synthesized followed the protocol described in the kit and was purified using a QIAquick PCR Extraction Kit (Qiagen, Valencia,CA USA) and eluted in elution buffer (EB). The short fragments were then ligated to sequencing adapters. Suitable fragments of approximately 200 bp were selected as templates for amplification in a MyCycler PCR instrument (Bio-Rad, Hercules, CA USA) with the following program: denaturation at 98°C for 30 s followed by 15 cycles of 98°C for 10 s, 65°C for 30 s, and 72°C for 30 s plus a terminal hold at 72°C for 5 min. The samples were then purified using the QIAquick PCR Purification Kit according to the manufacturer’s protocol and eluted in 30 μL of EB. A 1-μL aliquot of the construct was loaded onto an Agilent Technologies 2100 Bioanalyzer using the Agilent DNA 1000 Chip Kit (Agilent, Santa Clara, CA USA). After verifying the size and purity of the DNA fragments, the library was sequenced using an Illumina GA II x platform by BGI (Shenzhen, China).

The DNA Illumina TruSeq DNA sample preparation kit was used to prepare a genomic DNA library according to the manufacturer’s protocol. Contaminating RNA was removed by treating with RNase A. The DNA samples were sent to the Beijing Novo Gene Company (Beijing, China) for sequencing using an Illumina Hiseq 2000 platform.

Read alignment

The raw reads were filtered before data analysis by removing reads consisting of adaptors only, those with greater than 10% unknown bases, and those in which more than half of the bases gave a quality score of less than 5.0. The remaining (clean) reads were mapped to the reference genome of Japonica variety Nipponbare (http://www.gramene.org/ webcite) using SOAP2 software [50]. Mismatches of no more than two bases were allowed in the alignment. We used the reads per kb per million reads (RPKM) method to calculate unique gene expression levels [51].

SNP identification

SOAP2 was used to align each read to the Nipponbare reference genome (http://www.gramene.org/ webcite), with no more than three mismatches to the candidate SNPs permitted for each read [50]. This method of SNP calling has been previously described [46]. A statistical model based on Bayesian theory and the Illumina quality system was used to calculate the probability of each possible genotype at every position from the alignment of reads to the reference genome. Six criteria were set to filter out unreliable SNPs: 1) the read quality value must be no lower than 20, 2) the SNPs must be at least 5 bp from each other, 3) the SNP must be represented by at least four reads, 4) the sequencing depth must be less than 10,000, 5) the SNP must be more than 5 bp distant from an intron-exon junction and 6) the approximate copy number of the flanking sequences must be less than four. SNP sets from each biological replicate of GL versus TQ, GL versus 93–11, and 93–11 versus TQ were obtained and used for further allele-specific analysis.

ASE analysis

The paternal and maternal alleles expressed in the F1 hybrid transcriptomes were distinguished by their SNP nucleotype. The expression level was calculated based on at least 10 reads of single genes. The expression level from a paternal or maternal allele was calculated based on the number of reads for the given allele divided by the total number of reads for the SNP. When only one allele was expressed, the gene was categorized as showing monoallelic expression. When the allele expression level was biased toward one parent by more than two-fold, the gene was categorized as showing preferential allelic expression. When the two alleles were expressed equally (less than two-fold difference), the gene was categorized as showing biallelic expression. In our computer analysis, the relative allele expression value of “0”, which occurred when an allele was not expressed, could not be computed. Therefore, to enable calculations, the “0” was replaced by 0.001.

SNP confirmation and validation of allelic expression

For SNP validation, primers flanking the SNP sites were designed to amplify 300–800 bp fragments. PCR amplification was performed using reaction mixtures containing 10 ng of cDNA template and 5 μM primer. PCR was performed using the MyCycler PCR system with the following parameters: denaturation at 95°C for 5 min followed by 30 cycles of 95°C for 30 s, 50–55°C for 30 s, and 72°C for 30 s plus a terminal hold at 72°C for 5 min. PCR products were purified using the Axy PCR Cleanup Kit (AxyGEN Bioscience, Union city, CA USA) and sequenced using an ABI 3730xI machine. The resulting sequences were compared with reference SNPs detected by genome sequencing.

For monoallelic expression and preferential allelic expression validation, cDNA and gDNA from each F1 sample were used as PCR templates, and the sequences derived from the genome and transcriptome were compared. For preferential allelic expression and biallelic expression validation, cDNA from reciprocal F1 hybrids was used as PCR templates. The sequencing results for cDNA from reciprocal F1 hybrids were compared to detect parent-of-origin effects.

GO and statistical analyses

GO analysis was performed using the open-source MAS3 database (http://bioinfo.capitalbio.com/mas3/ webcite). A threshold of a two-fold change in gene expression levels and a FDR of <0.05 were used to identify differentially expressed genes. The P-values and the FDRs of differentially expressed genes were calculated as described [52].

Gene expression level validation

To validate the gene expression level, 30 randomly selected genes with different expression levels were verified by quantitative RT-PCR as described by Wang et al. [53].

Abbreviations

ASE: Allele-specific expression; DEG: Differentially expressed gene; FDR: False discovery rate; GO: Gene ontology; RPKM: Reads per kb per million reads; RT-PCR: Reverse transcription polymerase chain reaction; SNP: Single nucleotide polymorphism.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

GYS performed the most of the experiments; ZBG performed part of data analysis; GYS, ZWL, RC and CL did the monoallelic confirmation; DMJ performed all field experiments; ZWL, XFQ, WW, LPZ and QC helped GYS performed crossing and analysis of phenotypes; DCY and YGZ supervised this study; DCY and GYS designed the experiments and wrote the manuscript. All the authors discussed the results and contributed to the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We are grateful for BGI at Shenzhen carried out RNA sequencing and data analysis, Beijing Novo Gene Company conducted DNA re-sequencing. This work was supported by National Basic Research Program of China (grant No. 2011CB100102). We are thankful to Prof. Penny von Wettstein-Knowles (Copenhagen University) for critical reading of the manuscript.

References

  1. Shull GH: The composition of a field of maize.

    Am Breeders Assoc Rep 1908, 4:296-301. OpenURL

  2. Darwin CR: The effects of cross and self fertilization in the vegetable kingdom. London: John Murray; 1876. OpenURL

  3. East EM: Inbreeding in corn.

    Conn Agric Exp Sta Rep 1907, 1908:419-428. OpenURL

  4. Crow JF: Alternative hypotheses of hybrid vigor.

    Genetics 1948, 33(5):477-487. OpenURL

  5. Zhou G, Chen Y, Yao W, Zhang C, Xie W, Hua J, Xing Y, Xiao J, Zhang Q: Genetic composition of yield heterosis in an elite rice hybrid.

    Proc Natl Acad Sci U S A 2012, 109(39):15847-15852. OpenURL

  6. Lariepe A, Mangin B, Jasson S, Combes V, Dumas F, Jamin P, Lariagon C, Jolivot D, Madur D, Fievet J, et al.: The genetic basis of heterosis: multiparental quantitative trait loci mapping reveals contrasted levels of apparent overdominance among traits of agronomical interest in maize (Zea mays L.).

    Genetics 2012, 190(2):795-811. OpenURL

  7. Meyer RC, Kusterer B, Lisec J, Steinfath M, Becher M, Scharr H, Melchinger AE, Selbig J, Schurr U, Willmitzer L, et al.: QTL analysis of early stage heterosis for biomass in Arabidopsis.

    Theor Appl Genet 2010, 120(2):227-237. OpenURL

  8. Li L, Lu K, Chen Z, Mu T, Hu Z, Li X: Dominance, overdominance and epistasis condition the heterosis in two heterotic rice hybrids.

    Genetics 2008, 180(3):1725-1742. OpenURL

  9. Hua J, Xing Y, Wu W, Xu C, Sun X, Yu S, Zhang Q: Single-locus heterotic effects and dominance by dominance interactions can adequately explain the genetic basis of heterosis in an elite rice hybrid.

    Proc Natl Acad Sci U S A 2003, 100(5):2574-2579. OpenURL

  10. Luo LJ, Li ZK, Mei HW, Shu QY, Tabien R, Zhong DB, Ying CS, Stansel JW, Khush GS, Paterson AH: Overdominant epistatic loci are the primary genetic basis of inbreeding depression and heterosis in rice. II. Grain yield components.

    Genetics 2001, 158(4):1755-1771. OpenURL

  11. Li ZK, Luo LJ, Mei HW, Wang DL, Shu QY, Tabien R, Zhong DB, Ying CS, Stansel JW, Khush GS, et al.: Overdominant epistatic loci are the primary genetic basis of inbreeding depression and heterosis in rice. I. Biomass and grain yield.

    Genetics 2001, 158(4):1737-1753. OpenURL

  12. Stupar RM, Springer NM: Cis-transcriptional variation in maize inbred lines B73 and Mo17 leads to additive expression patterns in the F1 hybrid.

    Genetics 2006, 173(4):2199-2210. OpenURL

  13. Zhang X, Borevitz JO: Global analysis of allele-specific expression in Arabidopsis thaliana.

    Genetics 2009, 182(4):943-954. OpenURL

  14. Wei G, Tao Y, Liu G, Chen C, Luo R, Xia H, Gan Q, Zeng H, Lu Z, Han Y, et al.: A transcriptomic analysis of superhybrid rice LYP9 and its parents.

    Proc Natl Acad Sci U S A 2009, 106(19):7695-7701. OpenURL

  15. He G, Zhu X, Elling AA, Chen L, Wang X, Guo L, Liang M, He H, Zhang H, Chen F, et al.: Global epigenetic and transcriptional trends among two rice subspecies and their reciprocal hybrids.

    Plant Cell 2010, 22(1):17-33. OpenURL

  16. Paschold A, Jia Y, Marcon C, Lund S, Larson NB, Yeh CT, Ossowski S, Lanz C, Nettleton D, Schnable PS, et al.: Complementation contributes to transcriptome complexity in maize (Zea mays L.) hybrids relative to their inbred parents.

    Genome Res 2012, 22(12):2445-2454. OpenURL

  17. Bix M, Locksley RM: Independent and epigenetic regulation of the interleukin-4 alleles in CD4+ T cells.

    Science 1998, 281(5381):1352-1354. OpenURL

  18. Chess A, Simon I, Cedar H, Axel R: Allelic inactivation regulates olfactory receptor gene expression.

    Cell 1994, 78(5):823-834. OpenURL

  19. Lyon MF: X chromosomes and dosage compensation.

    Nature 1986, 320(6060):313. OpenURL

  20. Pernis B, Chiappino G, Kelus AS, Gell PG: Cellular localization of immunoglobulins with different allotypic specificities in rabbit lymphoid tissues.

    J Exp Med 1965, 122(5):853-876. OpenURL

  21. Zhang K, Li JB, Gao Y, Egli D, Xie B, Deng J, Li Z, Lee JH, Aach J, Leproust EM, et al.: Digital RNA allelotyping reveals tissue-specific and allele-specific gene expression in human.

    Nat Methods 2009, 6(8):613-618. OpenURL

  22. Gimelbrant A, Hutchinson JN, Thompson BR, Chess A: Widespread monoallelic expression on human autosomes.

    Science 2007, 318(5853):1136-1140. OpenURL

  23. Yan H, Yuan W, Velculescu VE, Vogelstein B, Kinzler KW: Allelic variation in human gene expression.

    Science 2002, 297(5584):1143. OpenURL

  24. Springer NM, Stupar RM: Allele-specific expression patterns reveal biases and embryo-specific parent-of-origin effects in hybrid maize.

    Plant Cell 2007, 19(8):2391-2402. OpenURL

  25. Guo M, Rupe MA, Zinselmeier C, Habben J, Bowen BA, Smith OS: Allelic variation of gene expression in maize hybrids.

    Plant Cell 2004, 16(7):1707-1716. OpenURL

  26. Lam HY, Pan C, Clark MJ, Lacroute P, Chen R, Haraksingh R, O’Huallachain M, Gerstein MB, Kidd JM, Bustamante CD, et al.: Detecting and annotating genetic variations using the HugeSeq pipeline.

    Nat Biotechnol 2012, 30(3):226-229. OpenURL

  27. Nodine MD, Bartel DP: Maternal and paternal genomes contribute equally to the transcriptome of early plant embryos.

    Nature 2012, 482(7383):94-97. OpenURL

  28. Stupar RM, Gardiner JM, Oldre AG, Haun WJ, Chandler VL, Springer NM: Gene expression analyses in maize inbreds and hybrids with varying levels of heterosis.

    BMC Plant Biol 2008, 8:33. OpenURL

  29. Springer NM, Stupar RM: Allelic variation and heterosis in maize: how do two halves make more than a whole?

    Genome Res 2007, 17(3):264-275. OpenURL

  30. Birchler JA, Veitia RA: The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution.

    New Phytol 2010, 186(1):54-62. OpenURL

  31. Birchler JA, Yao H, Chudalayandi S, Vaiman D, Veitia RA: Heterosis.

    Plant Cell 2010, 22(7):2105-2112. OpenURL

  32. Fujimoto R, Taylor JM, Shirasawa S, Peacock WJ, Dennis ES: Heterosis of Arabidopsis hybrids between C24 and Col is associated with increased photosynthesis capacity.

    Proc Natl Acad Sci U S A 2012, 109(18):7109-7114. OpenURL

  33. Hochholdinger F, Hoecker N: Towards the molecular basis of heterosis.

    Trends Plant Sci 2007, 12(9):427-432. OpenURL

  34. Fischer RA, Kohn GD: The relationship of grain yield to vegetative growth and post-flowering leaf area in the wheat crop under conditions of limited soil moisture.

    Aust J Agric Res 1966, 17:281-295. OpenURL

  35. Mai M, Yokomizo A, Qian C, Yang P, Tindall DJ, Smith DI, Liu W: Activation of p73 silent allele in lung cancer.

    Cancer Res 1998, 58(11):2347-2349. OpenURL

  36. Hollander GA, Zuklys S, Morel C, Mizoguchi E, Mobisson K, Simpson S, Terhorst C, Wishart W, Golan DE, Bhan AK, et al.: Monoallelic expression of the interleukin-2 locus.

    Science 1998, 279(5359):2118-2121. OpenURL

  37. Rhoades KL, Singh N, Simon I, Glidden B, Cedar H, Chess A: Allele-specific expression patterns of interleukin-2 and Pax-5 revealed by a sensitive single-cell RT-PCR analysis.

    Curr Biol 2000, 10(13):789-792. OpenURL

  38. Wittkopp PJ, Haerum BK, Clark AG: Evolutionary changes in cis and trans gene regulation.

    Nature 2004, 430(6995):85-88. OpenURL

  39. Cowles CR, Hirschhorn JN, Altshuler D, Lander ES: Detection of regulatory variation in mouse genes.

    Nat Genet 2002, 32(3):432-437. OpenURL

  40. Guo M, Yang S, Rupe M, Hu B, Bickel DR, Arthur L, Smith O: Genome-wide allele-specific expression analysis using Massively Parallel Signature Sequencing (MPSS) reveals cis- and trans-effects on gene expression in maize hybrid meristem tissue.

    Plant Mol Biol 2008, 66(5):551-563. OpenURL

  41. Zhang M, Zhao H, Xie S, Chen J, Xu Y, Wang K, Guan H, Hu X, Jiao Y, Song W, et al.: Extensive, clustered parental imprinting of protein-coding and noncoding RNAs in developing maize endosperm.

    Proc Natl Acad Sci U S A 2011, 108(50):20042-20047. OpenURL

  42. Wolff P, Weinhofer I, Seguin J, Roszak P, Beisel C, Donoghue MT, Spillane C, Nordborg M, Rehmsmeier M, Kohler C: High-resolution analysis of parent-of-origin allelic expression in the Arabidopsis Endosperm.

    PLoS Genet 2011, 7(6):e1002126. OpenURL

  43. Waters AJ, Makarevitch I, Eichten SR, Swanson-Wagner RA, Yeh CT, Xu W, Schnable PS, Vaughn MW, Gehring M, Springer NM: Parent-of-origin effects on gene expression and DNA methylation in the maize endosperm.

    Plant Cell 2011, 23(12):4221-4233. OpenURL

  44. Luo M, Taylor JM, Spriggs A, Zhang H, Wu X, Russell S, Singh M, Koltunow A: A genome-wide survey of imprinted genes in rice seeds reveals imprinting primarily occurs in the endosperm.

    PLoS Genet 2011, 7(6):e1002125. OpenURL

  45. Shoemaker R, Deng J, Wang W, Zhang K: Allele-specific methylation is prevalent and is contributed by CpG-SNPs in the human genome.

    Genome Res 2010, 20(7):883-889. OpenURL

  46. Song R, Messing J: Gene expression of a gene family in maize based on noncollinear haplotypes.

    Proc Natl Acad Sci U S A 2003, 100(15):9055-9060. OpenURL

  47. Meyer RC, Witucka-Wall H, Becher M, Blacha A, Boudichevskaia A, Dormann P, Fiehn O, Friedel S, von Korff M, Lisec J, et al.: Heterosis manifestation during early Arabidopsis seedling development is characterized by intermediate gene expression and enhanced metabolic activity in the hybrids.

    Plant J 2012, 71(4):669-683. OpenURL

  48. Krieger U, Lippman ZB, Zamir D: The flowering gene SINGLE FLOWER TRUSS drives heterosis for yield in tomato.

    Nat Genet 2010, 42(5):459-463. OpenURL

  49. Ma Q, Hedden P, Zhang Q: Heterosis in rice seedlings: its relationship to gibberellin content and expression of gibberellin metabolism and signaling genes.

    Plant Physiol 2011, 156(4):1905-1920. OpenURL

  50. Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, Wang J: SOAP2: an improved ultrafast tool for short read alignment.

    Bioinformatics 2009, 25(15):1966-1967. OpenURL

  51. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq.

    Nat Methods 2008, 5(7):621-628. OpenURL

  52. Audic S, Claverie JM: The significance of digital gene expression profiles.

    Genome Res 1997, 7(10):986-995. OpenURL

  53. Wang W, Liu Z, Guo Z, Song G, Cheng Q, Jiang D, Zhu Y, Yang D: Comparative transcriptomes profiling of photoperiod-sensitive male sterile rice Nongken 58S during the male sterility transition between short-day and long-day.

    BMC Genomics 2011, 12:462. OpenURL