Open Access Research article

Partial sequencing of the bottle gourd genome reveals markers useful for phylogenetic analysis and breeding

Pei Xu1, Xiaohua Wu1, Jie Luo2, Baogen Wang1, Yonghua Liu1, Jeffrey D Ehlers3, Sha Wang1, Zhongfu Lu1 and Guojing Li1*

Author Affiliations

1 Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, PR China

2 Institute of Digital Agricultural Research, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, PR China

3 Department of Botany and Plant Sciences, University of California, Riverside, CA 92521-0124 USA

BMC Genomics 2011, 12:467  doi:10.1186/1471-2164-12-467


The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2164/12/467


Received: 10 May 2011
Accepted: 27 September 2011
Published: 27 September 2011

© 2011 Xu et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Bottle gourd [Lagenaria siceraria (Mol.) Standl.] is an important cucurbit crop worldwide. Archaeological research indicates that bottle gourd was domesticated more than 10,000 years ago, making it one of the earliest plants cultivated by man. In spite of its widespread importance and long history of cultivation, almost nothing has been known about the genome of this species thus far.

Results

We report here the partial sequencing of the bottle gourd genome using the 454 GS-FLX Titanium sequencing platform. A total of 150,253 sequence reads were generated and assembled into 3,994 contigs and 82,522 singletons. The total length of the non-redundant singletons/assemblies is 32 Mb, theoretically covering ~10% of the bottle gourd genome. Functional annotation of the sequences revealed a broad range of functional types, covering all three top-level ontologies. Comparison of the gene sequences between bottle gourd and the model cucurbit cucumber (Cucumis sativus) revealed a 90% sequence similarity on average. Using the sequence information, 4395 microsatellites were identified and 400 SSR markers were developed; of the 200 markers tested, 94% amplified bands of the anticipated sizes. Transferability of these markers to four other cucurbit species declined markedly with increasing phylogenetic distance. Polymorphisms of a subset of 14 SSR markers assayed on 44 representative Chinese bottle gourd varieties/landraces were used to perform a principal coordinates (PCo) analysis and to construct a UPGMA-based dendrogram. Bottle gourd accessions tended to group by fruit shape rather than geographic origin, although in certain subclades the lines from the same or close origin did tend to cluster.

Conclusions

This work provides an initial basis for genome characterization, gene isolation and comparative genomics analysis in bottle gourd. The SSR markers developed would facilitate marker assisted breeding schemes for efficient introduction of desired traits.

Background

Bottle gourd [Lagenaria siceraria (Mol.) Standl.] (2n = 2x = 22), also known as calabash or opo squash, is a diploid belonging to the genus Lagenaria of the Cucurbitaceae family [1]. Phylogenetically, bottle gourd is close to many economically important cucurbit species, including cucumber and melon, which belong to the genus Cucumis, and watermelon, which belongs to the genus Citrullus. Worldwide, bottle gourd is grown for its fruit, which is either harvested young and used as a vegetable or harvested mature and used as a bottle, utensil, or pipe. The fresh fruit, which usually has a light green smooth skin and white flesh, is frequently used in many regions of Asia and Africa as either a stir-fry or soup vegetable ingredient [2]. A more recent use of bottle gourd is as a rootstock for watermelon to protect against soil-borne diseases and low soil temperature [3,4].

Bottle gourd was one of the first crops to be domesticated. Based on archaeological evidence, bottle gourd is presumed to have been domesticated in Africa [5,6], and might have dispersed to the New World by ocean currents or by human migration in pre-historic times [7,8]. Africa is believed to be the centre of genetic diversity for bottle gourd, although wild progenitors of bottle gourd have not been identified there [6]. Substantial morphological variation for fruit and seed size, shape, color and rind hardness exists in the bottle gourd gene pool [8-10]. Yetisir et al. observed a wide range of morphological variation among Turkish bottle gourd accessions despite the fact that this region is not a center of origin of the crop [11].

At present, very few molecular genetic/genomic resources are publicly available for bottle gourd. Achigan-Dako et al. measured the genome size of bottle gourd and showed that the nuclear 2C-value of DNA was around 0.734 pg, which is estimated to be equal to ~334 Mb [12]. In spite of the relatively small genome size of bottle gourd, there are only dozens of bottle gourd DNA sequences available in the public DNA database, making it unfeasible to identify bottle gourd genes or to analyze their functions. A limited number of anonymous random amplified polymorphic DNA (RAPD) markers have been described [10,13], but there have been no locus-specific DNA markers such as microsatellite (SSR), sequence tagged site (STS) or single nucleotide polymorphism (SNP) markers available for bottle gourd so far. Also unclear is the extent of genome conservation/diversification between bottle gourd and other important cucurbit species such as the model cucumber (Cucumis sativus L.), which serves as the basis for comparative genomic analysis across cucurbit species.

Microsatellites, or simple sequence repeats (SSRs), are short repeat motifs usually associated with a high frequency of length polymorphism. With the advantages of being stable, PCR-based and relatively low-cost, SSR markers are one of the best choices for genetic research and molecular breeding. SSR markers can be developed in silico [14], when a large number of DNA sequences is available, or experimentally [15]. Traditionally, the experimental approach requires the construction of a genomic library enriched for repeated motifs, hybridization and isolation of microsatellite-containing clones, sequencing of positive clones and primer design [16]. Most of these steps, especially the hybridization/isolation step, are expensive and time-consuming. Recently emerged 'next generation' sequencing techniques, for instance the 454 Genome Sequencer FLX (GS-FLX Titanium) shotgun system (Roche, Penzberg, Germany), provide a powerful alternative for generating a tremendous number of DNA sequences for genomics studies and marker development. Instead of creating a conventional genomic library enriched for microsatellites, the GS-FLX Titanium system sequences a shotgun library in a high-throughput manner, producing tens of thousands of reads of around 300-400 bp. By mining the sequence reads, SSR-containing sequences can be identified. Using this technology, we partially sequenced the bottle gourd genome. By assembling and annotating the sequence reads, 16,135 putative gene-space sequences covering a broad range of functional types were recognized. Moreover, hundreds of microsatellite markers were developed from the sequencing data, which will be invaluable in future marker-assisted breeding and phylogenetic analysis. The markers were then applied to a range of bottle gourd accessions to assess genetic diversity, to enable more efficient parental line selection for breeding purposes and to dissect the genetic factors underlying morphological variation.

Methods

Plant materials

Forty-four accessions representing geographically and phenotypically different bottle gourd germplasm in China were used in this study (Figure 1; Table 1). The bottle gourd accession used for GS-FLX Titanium sequencing is 'Hangzhou gourd', a landrace from southern China. One accession each of four other cucurbits, i.e. bitter gourd (Momordica charantia L.), loofah [Luffa acutangula (L.) Roxb], pumpkin (Cucurbita pepo L.) and watermelon [Citrullus lanatus (Thunb.)], was also used.

Figure 1. Distribution of 44 Chinese bottle gourd accessions used in the current study. The solid triangles indicate collection sites of the materials, and the number following each triangle indicates the number of accessions collected from the site.

Table 1. Cultivar or accession, origin, fruit shape class and type of the genotypes assayed for SSR polymorphisms

DNA extraction

Genomic DNA was extracted from leaves of two-week-old seedlings using a modified CTAB method [17].

DNA library construction and sequencing

To construct the DNA library for GS-FLX Titanium sequencing, 5 mg of genomic DNA were fragmented into 300-800 bp pieces by nebulization. Short adaptors were then ligated to the 3' and 5' ends. Emulsion PCR (emPCR) was carried out at a concentration of 1 copy per bead in six emulsion oils, to give 43,800 enriched beads. Amplified fragments were sequenced on one quarter of an LR70 plate. The reads from GS-FLX Titanium sequencing were assembled with the software Newbler (http://rcc.uga.edu/software/app/newbler_GS_De_Novo_Assembler/) under default parameters.

Functional annotation of genes and gene ontology analysis

Functional annotation of the sequences was performed by a BLASTX search against the NCBI non-redundant (nr) protein database using the assembled contigs/singletons as queries. The cut-off E-value for significance was set at e-10. Putative gene ontology terms and functional categories were obtained on the basis of the GO Consortium (http://www.geneontology.org/) using BLAST2GO (http://www.blast2go.de).

Alignment of gene sequences between genomes

The cucumber genome sequence was downloaded from Phytozome (ftp://ftp.jgi-psf.org/pub/JGI_data/phytozome/v6.0/Csativus/). A total of 16,135 bottle gourd contigs/singletons, which could be functionally annotated at an E-value < e-10 and were thus considered to originate from the gene space of the genome, were compared with the cucumber genome sequence by BLASTN under an E-value threshold of e-10 with an overlap of at least 100 bp. For comparison of the Cryptochrome 1 molecular clock marker genes and the UDP-glucosyltransferase genes among species, each gene sequence was downloaded from GenBank (http://www.ncbi.nlm.nih.gov) under the accession numbers AB073546.1 (OsCRY1a, rice), EF601539.1 (TaCRY1a, wheat), AB498928.1 (GmCRY1, soybean) and FE690583.1 (PvCRY1, common bean), or from the cucurbit unigene database (http://www.icugi.org/cgi-bin/ICuGI/EST/home.cgi?organism=melon) under the accession numbers MU45735 (CmCRY1, melon) and MU59780 (UDP-glucosyltransferase gene, melon), with the exception of the cucumber CRY1 and UDP-glucosyltransferase genes, whose sequences were extracted from the cucumber genome database [18].
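The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: it assumes the BLASTN hits are available in the 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score), and the function names and example rows are hypothetical.

```python
import csv
import io

E_VALUE_MAX = 1e-10      # E-value threshold stated in the text
MIN_OVERLAP_BP = 100     # minimum alignment length stated in the text


def best_hits(tabular_text):
    """Keep, per query, the highest-scoring hit that passes both thresholds.

    Returns {query_id: (percent_identity, alignment_length)}.
    Assumes 12-column tab-separated BLAST output (an assumption, see lead-in).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, pident, length = row[0], float(row[2]), int(row[3])
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > E_VALUE_MAX or length < MIN_OVERLAP_BP:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (pident, length, bitscore)
    return {q: (p, n) for q, (p, n, _) in best.items()}


def mean_identity(hits):
    """Average percent identity over the retained best hits."""
    return sum(p for p, _ in hits.values()) / len(hits)


# Made-up rows for illustration only; these are not data from the study.
EXAMPLE = "\n".join([
    "contig1\tscafA\t92.50\t250\t18\t1\t1\t250\t1000\t1249\t1e-80\t310",
    "contig1\tscafB\t88.00\t120\t14\t0\t5\t124\t500\t619\t1e-20\t120",
    "contig2\tscafA\t95.00\t80\t4\t0\t1\t80\t10\t89\t1e-25\t130",
    "contig3\tscafC\t85.00\t150\t22\t1\t1\t150\t40\t189\t1e-05\t60",
    "contig4\tscafD\t87.50\t200\t25\t0\t1\t200\t1\t200\t1e-40\t200",
])

if __name__ == "__main__":
    hits = best_hits(EXAMPLE)
    print(hits)
    print(mean_identity(hits))
```

In the made-up example, contig2 is dropped for an overlap shorter than 100 bp and contig3 for an E-value above the threshold, so only contig1 and contig4 contribute to the mean identity.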

Microsatellites mining, primer design and SSR assay

The assembled contig/singleton sequences were screened for perfect microsatellites using the software mreps 2.5 (http://bioinfo.lifl.fr/mreps/) [19]. The software Websat (http://wsmartins.net/websat/) was used to design primers flanking SSRs [20]. Only sequences containing SSRs equal to or longer than 20 bp were used for primer design. The procedure of SSR assay followed Xu et al. [14].
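To make the idea of screening for perfect microsatellites of at least 20 bp concrete, the following is a minimal regular-expression sketch. It is not mreps and does not reproduce its algorithm or options; the function names, the motif-length range (1 to 10 nt) and the overlap handling are assumptions chosen for illustration.

```python
import re

MIN_SSR_LENGTH = 20    # bp, the threshold used in the text
MAX_MOTIF_LENGTH = 10  # mono- to dekanucleotide motifs (assumed range)


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_perfect_ssrs(seq, min_length=MIN_SSR_LENGTH):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Shorter motifs are searched first; a later hit overlapping an
    already-reported repeat is ignored.
    """
    seq = seq.upper()
    found, covered = [], []
    for k in range(1, MAX_MOTIF_LENGTH + 1):
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            start, end = m.span()
            motif = m.group(1)
            if end - start < min_length or not is_primitive(motif):
                continue
            if any(start < e and s < end for s, e in covered):
                continue
            covered.append((start, end))
            found.append((start, motif, (end - start) // k))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical test sequence: (AT)12 and (AAT)7 pass, (AG)5 is too short.
    demo = "GGC" + "AT" * 12 + "CCGTC" + "AAT" * 7 + "GC" + "AG" * 5 + "TTC"
    for start, motif, count in find_perfect_ssrs(demo):
        print(start, motif, count)
```

A dedicated tool remains preferable for real data, since this sketch ignores imperfect repeats, ambiguous bases and reverse-complement motif classes.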

Analysis of genetic diversity

The alleles present in each genotype were scored visually for each SSR locus. Number of alleles and allele frequency per locus were calculated manually. The computer program PIC_Calc 0.6 (http://www.esnips.com/doc/9171097b-ac41-424a-9d35-e7d4e540ec9f/Picalc) was used to measure the polymorphism information content (PIC) value for each SSR locus using the formula $\mathrm{PIC}_i = 1 - \sum_j P_{ij}^2$, where $P_{ij}$ is the frequency of the jth allele of the ith locus [21]. Calculation of Nei's genetic distance (DA) and principal coordinates analysis (PCoA) were performed with NTSYSpc 2.10 [22]. A dendrogram showing relatedness among the 44 bottle gourd accessions was constructed using the unweighted pair-group method (UPGMA) based on the DA values.
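The PIC formula above can be illustrated with a short Python sketch. This is not the PIC_Calc program; the function name and the allele calls in the example are hypothetical, and one allele call per accession is assumed.

```python
from collections import Counter


def pic(allele_calls):
    """PIC for one locus: 1 minus the sum of squared allele frequencies.

    `allele_calls` is a list with one allele label per accession
    (an assumed input layout, chosen for illustration).
    """
    counts = Counter(allele_calls)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


if __name__ == "__main__":
    # Hypothetical locus scored on 44 accessions with three alleles
    # at frequencies 0.5, 0.25 and 0.25.
    calls = ["a"] * 22 + ["b"] * 11 + ["c"] * 11
    print(pic(calls))         # 1 - (0.25 + 0.0625 + 0.0625) = 0.625
    print(pic(["a"] * 44))    # a monomorphic locus gives 0.0
```

A locus with many alleles at similar frequencies gives a value close to 1, while a locus dominated by a single allele gives a value close to 0.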

Results

Summary of the GS-FLX sequencing data

A ¼ run on the GS-FLX system generated 150,253 reads that passed the quality filters, giving a total length of 56,368,975 bp. The length of individual reads ranged from 23 bp to 700 bp, with an average of 375.2 bp. The majority of the read lengths fell between 350 bp and 500 bp. These sequences were then assembled into contigs based on sequence overlaps. After removing 75 long contigs (> 2 kb) that were found to be of chloroplast/mitochondrial origin, 3,994 contigs ranging from 100 bp to 1,873 bp with an average length of 1236 bp and 82,522 singletons ranging from 23 bp to 649 bp with an average length of 362 bp were obtained (Table 2). Taken together, these non-redundant contigs and singletons represent ~32 Mb of nuclear DNA sequence, covering ~10% of the bottle gourd genome. The original sequencing data are accessible at the DDBJ database under the accession number DRR001005. The assembled contig/singleton sequences can be downloaded from ftp://60.191.1.9 under the user name 'gourd' and password 'sequence2011'.
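The summary figures above can be checked with simple arithmetic. The sketch below uses only numbers stated in the article (the ~334 Mb genome size is the estimate cited in the Background); the variable names are illustrative.

```python
# Values taken from the text.
TOTAL_BASES = 56_368_975      # bp in quality-filtered reads
N_READS = 150_253             # number of reads
NONREDUNDANT_MB = 32          # approximate non-redundant nuclear sequence, Mb
GENOME_SIZE_MB = 334          # estimated genome size cited in the Background, Mb

mean_read_length = TOTAL_BASES / N_READS
genome_fraction = NONREDUNDANT_MB / GENOME_SIZE_MB

if __name__ == "__main__":
    print(f"mean read length: {mean_read_length:.1f} bp")
    print(f"fraction of genome represented: {genome_fraction:.1%}")
```

Both results agree with the rounded values reported in the text (an average read of about 375 bp and roughly one tenth of the genome).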

Table 2. Length distribution of contigs and singletons

Functional annotation of the sequences

A BLASTX search against the NCBI GenBank peptide database using the 3,994 contigs and 82,522 singletons as queries yielded 18,033 annotated sequences under an E-value threshold of e-10. After removing putative plastid/mitochondrial sequences and retrotransposon/transposon elements, 16,135 'clean' sequences were retained. The lengths of the annotated sequences varied from 105 bp to 1,873 bp. Of these, 11,216 (69%) matched hypothetical/predicted proteins or unknown/unnamed proteins, while 4,919 (31%) had known putative gene function annotations (Additional file 1).

Additional file 1. Functional annotation of the contigs/singletons. The annotation of putative functions of the contigs/singletons with an E-value equal to or smaller than e-10 through BLASTX.

Functional assignments for the 4,919 sequences with putative gene function annotations covered all three top-level ontologies, i.e. cellular component, biological process and molecular function. Among the sequences classified under molecular function, the largest category was binding (40.8%), followed by catalytic activity (39.1%). In the biological process class, cellular process formed the major category (22.5%). Cell part (31.4%) was the dominant group in the cellular component classification (Figure 2).

Figure 2. Gene ontology (GO) categories of 4919 non-redundant contigs/singletons with a putative functional annotation.

Conservation of gene sequences between bottle gourd and cucumber

To estimate the extent of sequence conservation between the gene spaces of bottle gourd and the model cucurbit cucumber, we compared 16,135 bottle gourd contigs/singletons that were assigned a functional annotation with the newly available cucumber genome sequence. The BLASTN results showed that 13,370 bottle gourd sequences matched the cucumber genome with an overlap of at least 100 bp (Additional file 2). As expected, most of the matched sequences occur in exon regions, giving an average sequence identity as high as 90.3%. Six hundred and fourteen bottle gourd sequences (4.6%) had more than 95% identity with cucumber, while 1252 sequences (9.4%) showed relatively low sequence conservation (less than 85% identity). Notably, we found that the gene Cryptochrome 1 (CRY1), which encodes a blue light receptor ubiquitous throughout the plant kingdom and is frequently used as a phylogenetic molecular clock marker [23,24], showed an identity as high as 93.5% in the conserved C-terminus DAS domains between the two species, demonstrating that the two species are phylogenetically very close. Another conserved plant gene, the UDP-glucosyltransferase gene, showed 85% sequence identity between bottle gourd and cucumber and a much higher sequence identity between melon and cucumber (93%, see discussion below).

Additional file 2. Sequence comparison between bottle gourd and cucumber. BLASTN result between bottle gourd gene space sequences and cucumber genome. Highlighted are percentage identity values.

Characterization of microsatellites in bottle gourd

A search of the sequenced portion of the bottle gourd genome for microsatellite-containing sequences identified 201 contigs and 3815 singletons at the threshold of SSR length ≥ 20 bp, harboring a total of 4395 discrete microsatellites. Of these, dinucleotide and dekanucleotide repeats are the most abundant, each accounting for ~13% of the total number. Trinucleotide repeats are also abundant, while mononucleotide and pentanucleotide repeats are relatively rare (Table 3). The length of the majority of the SSRs ranged from 20 to 56 nucleotides, with the longest reaching 244 nucleotides. The number of repeat units varied between 2 and 122. Of the dominant dinucleotide and dekanucleotide repeats, AT/AT and TTCTCTCTCT/AGAGAGAGAA are the most frequent motifs. AAT/ATT, TTTA/TAAA and AAAAAT/ATTTTT are the most common tri-, tetra- and hexanucleotide repeats, respectively (Figure 3). Clearly, AT-rich repeats make up the majority of the microsatellites longer than 20 bp in the bottle gourd genome.

Table 3. Summary of the presence and type of simple sequence repeats (SSRs) longer than 20 bp

Figure 3. Frequency of Di- (a), Tri- (b), Hexa- (c) and Deka- (d) nucleotide SSR motifs. In each panel only the five most frequent nucleotide motifs are shown.

Around 32% of the non-redundant microsatellite-containing sequences were suitable for design of flanking PCR primers. The rest of the microsatellite-containing sequences were less useful in primer development because the microsatellites were too close to fragment ends to enable design of flanking PCR primers. We designed 400 SSR markers (Additional file 3) from the contigs/singletons sequences and tested the amplification of 200 (LSR001-LSR200) of them. Ninety-four percent of the PCR primers amplified products with anticipated sizes (data not shown), demonstrating a high fidelity and efficiency for large scale SSR marker development by the GS-FLX sequencing approach.

Additional file 3. Bottle gourd SSR markers developed. Sequences and characteristics of the microsatellite markers developed from bottle gourd contig/singleton sequences.

Transferability of the microsatellite markers across species

To test the usefulness of the newly developed microsatellite markers in other understudied cucurbit species, we investigated their transferability to four other cucurbits, i.e. bitter gourd (Momordica charantia L.), loofah [Luffa acutangula (L.) Roxb], pumpkin (Cucurbita pepo L.) and watermelon [Citrullus lanatus (Thunb.)], by amplification with 200 primer pairs (LSR001-LSR200). Cross-species SSR transferability was relatively low, except between bottle gourd and watermelon, which are both members of the subtribe Benincasinae. As expected, the rate of marker transferability declined markedly with increasing phylogenetic distance (Table 4). The use of genomic sequences from non-expressed regions may partially account for the low marker transferability across species.

Table 4. Transferability of the bottle gourd SSR markers to four other cucurbits

Genetic diversity of 44 Chinese bottle gourd accessions as assessed by SSR markers

Fourteen primer pairs that detected polymorphisms in at least two of the four selected bottle gourd lines, i.e. 'Long gourd', 'Longyan April gourd', 'Nanxiu' and 'Yongzhen No. 1' (data not shown) were used to genotype 44 entries of Chinese bottle gourd accessions (Table 1). A total of 51 alleles with two to eight alleles per locus were detected among the accessions, providing an average allele number of 3.64 per locus. The overall polymorphism information content (PIC) value varied from 0.11 to 0.72 with an average of 0.4 (Table 5).

Table 5. Number of alleles and polymorphism information content of the markers used in genetic diversity analysis

A two-dimensional principal coordinates analysis (PCoA) did not detect significant subgrouping among the 44 lines, although a tendency of certain accessions to congregate can still be observed (Figure 4). The distribution of the cultivars/landraces in general showed an association with fruit shape rather than geographic origin. For example, accessions with pyriform and tubby fruit formed clusters in the upper right and upper left corners, respectively, while two round-fruited accessions clustered in the lower right corner. The rest of the accessions exhibited a scattered distribution along the two axes. Consistent with this, the dendrogram constructed from UPGMA analysis showed three major groups, which in general correspond to the three clusters revealed by PCoA (Figure 5). The smallest group (Group III) consisted of the two lines (No. 1 and 17) with round fruits. Lines in Group I were all landraces with a pyriform fruit except for 'Nanxiu' (No. 12), which is a commercial cultivar popular in central China with a slender straight fruit. Group II, the biggest class, consisted of 25 accessions with a slender straight fruit and 7 tubby-fruited accessions, six of which clustered together in the dendrogram (Figure 5).

Figure 4. Two-dimensional principal coordinates analysis of 44 bottle gourd cultivars/accessions. Different codes of the data points represent fruit shapes of the genotypes.

Figure 5. A dendrogram of the 44 bottle gourd cultivars and accessions constructed using 14 microsatellite polymorphisms. Numbers indicate cultivar/accession as listed in Table 1. The ten accessions originating from Zhejiang province are marked by solid circles. The six accessions that harbor the intermediate tubby fruit and are clustered together are marked by open circles.

Even though no strong association was observed between the subgrouping and geographic origin of these accessions, cultivars or landraces sharing the same or close origins still tended to cluster together in certain subclades. For instance, all ten cultivars/landraces from Zhejiang province, a center of bottle gourd cultivation in China, were clustered together in Group II (Figure 5).

Discussion

Through partial sequencing of the genome via the 454 GS-FLX Titanium sequencing platform, we were able to rapidly generate DNA sequence resources for molecular marker development and genomic inquiry in bottle gourd, an 'orphan crop' for which few genomic resources have been developed thus far. Tens of thousands of sequences with putative functional annotation were identified, which will allow primer design or probe development for gene expression analysis, microarray assays, in silico cloning of genes, as well as comparative analysis among cucurbits. The availability of bottle gourd genome sequences will help to provide a better understanding of some traits specific to bottle gourd or to cucurbits. For example, the sequence information will facilitate the identification of genes responsible for the highly efficient water transport system that is characteristic of bottle gourd and other cucurbits [25], and the hunt for genes related to the biosynthetic pathway of the bitter-tasting cucurbitacins in cucurbits [26,27].

We provide the first insight into genome conservation/diversification between bottle gourd and the model cucurbit cucumber. We showed that the extent of gene space conservation between the two species is as high as 90%, demonstrating a close relationship between bottle gourd and cucumber. This is consistent with the result from analyzing the CRY1 molecular clock gene, which showed a 93% sequence identity between the two species in the C-terminus DAS domain. This value is higher than that between rice (Oryza sativa) and wheat (Triticum aestivum) (69.4%), two related Poaceae species, and even higher than that between the warm season legumes soybean (Glycine max) and common bean (Phaseolus vulgaris) (88%), indicating again that bottle gourd and cucumber are phylogenetically very close. However, the value is lower than that between melon (Cucumis melo) and cucumber (95%), which is consistent with the current phylogeny of cucurbits [28]. Similar results were obtained from analyzing the UDP-glucosyltransferase genes, where a much higher level of sequence identity was observed between melon/cucumber (93%) than between bottle gourd/cucumber (82%). The assay of SSR marker transferability across different cucurbit species also supported the known phylogeny and demonstrated that the bottle gourd SSR markers could be selectively used for watermelon (41% amplification rate) and, if necessary, loofah (20% amplification rate), owing to their relatively higher cross-species transferability.

Another direct use of the sequencing information is to develop a large number of microsatellite markers for marker-assisted breeding. The quick generation of over 150,000 sequence entries, which enabled development of thousands of SSR markers within only 1 week at low cost, is far superior to the traditional method based on hybridization and Sanger sequencing [15,29] in terms of time, labor and other costs. The GS-FLX Titanium system was chosen because it generates longer reads (~400 bp) than most other next generation sequencing systems, which is important for the subsequent design of SSR primers flanking the microsatellite motifs. We identified 4395 SSRs longer than 20 bp from the non-redundant 32 Mb bottle gourd genome sequence, which gives a frequency of 1 SSR per ~7.3 kb. This frequency is nearly double the estimate for cucumber (1 SSR per ~14.6 kb) obtained using 3x shotgun genome sequencing data [30], and demonstrates that SSRs could serve as a rich source for marker development in bottle gourd. The high frequency of dinucleotide and trinucleotide repeats is consistent with the situation in most other plant species including the cucurbits cucumber and watermelon [29-31]; however, the notably high proportion of dekanucleotide repeats could be a feature of the bottle gourd genome, although dekanucleotide repeats are also common in other plant genomes such as cowpea [31]. The AT-rich nature of the microsatellite motifs is conserved between bottle gourd and cucumber [30].
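The SSR frequency quoted above follows directly from the two counts given in the text. A minimal sketch of the arithmetic, with illustrative variable names and the cucumber figure taken as reported from [30]:

```python
# Values taken from the text.
N_SSRS = 4395                       # SSRs of at least 20 bp
NONREDUNDANT_BP = 32_000_000        # ~32 Mb of non-redundant sequence
CUCUMBER_KB_PER_SSR = 14.6          # cucumber estimate reported in the text

kb_per_ssr = NONREDUNDANT_BP / N_SSRS / 1000
fold_vs_cucumber = CUCUMBER_KB_PER_SSR / kb_per_ssr

if __name__ == "__main__":
    print(f"one SSR per {kb_per_ssr:.1f} kb")
    print(f"{fold_vs_cucumber:.1f}-fold the cucumber frequency")
```

Because the 32 Mb figure is itself approximate, the resulting density should be read as an estimate rather than an exact value.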

A dendrogram based on SSR genotyping of 44 representative Chinese bottle gourd cultivars/landraces did not detect obvious clustering by geographical location, which is in agreement with Yetisir et al., in which clustering of bottle gourd accessions from Turkey was based on fruit morphology much more than on geographical origin [11]. Founder effects followed by assortative mating, i.e. the original introduction of only limited genetic diversity within fruit types, followed by matings mostly within fruit types, would lead to the patterns of genetic diversity observed. This is supported by the relatively high genetic similarity observed among the bottle gourd lines, which varied between 51.2% and 94.3%. Decker-Walters et al. (2001) characterized 74 landraces/cultivars from a global sample and revealed that the lines from diverse origins (Africa, Asia and the New World) were readily separated [10]. Consistent with the result from Morimoto et al. (2005), fruit shape was found to be a principal component of the variation and is in general associated with the grouping of the lines based on molecular markers [8]. Our results indicate that Chinese bottle gourd germplasm could be divided into three major groups in terms of fruit shape, i.e. slender straight, tubby and round, although the variation of fruit shape is quantitative. Heiser proposed that bottle gourd plants producing large round fruits are typically native to tropical West Africa, whereas the long, thin, snake-like fruits are considered to be of Asian origin [9]. This, if true, is indicative of a mixed origin of Chinese bottle gourd germplasm. The presence of the pyriform and tubby fruit lines, which are considered an intermediate type, could be indicative of natural or artificial hybridization between the two ancient cultivar groups.
Relatively recent human migration events and recent germplasm introduction activities may further blur the patterns of diversity as revealed by the imperfect association between the morphology of the lines and their grouping.

Conclusions

We report here the generation of 454 GS-FLX Titanium sequencing data of the bottle gourd genome and its application to SSR marker discovery and genetic diversity analysis. The sequence information will allow characterization of the bottle gourd genome and facilitate gene isolation and comparative genomics analysis across species. The SSR markers developed will enable marker-assisted breeding of bottle gourd, while the characterization of patterns of diversity among representative Chinese bottle gourd accessions will facilitate the optimal use of genetic resources for breeding. As more genome sequence information for other cucurbits becomes available [18,32], it will soon be feasible to draw deeper and clearer insights into genome conservation/diversification among related crop cucurbit species.

Authors' contributions

PX and GL designed the experiments. XW and SW carried out the experiments. BW, YL and ZL participated in the field work and trait evaluation. PX analyzed all the data and performed computational analyses, with assistance from JL. PX drafted the manuscript and JDE and GL revised the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This study was partially supported by the National Science Foundation of China (No. 30900985) and the Natural Science Foundation of Zhejiang Province (No. Y3090146). We thank Dr. Haibin Xu for technical support in bioinformatic analysis.

References

  1. Beevy SS, Kuriachan P: Chromosome numbers of south Indian Cucurbitaceae and a note on the cytological evolution in the family.

    J Cytol Genet 1996, 31:65-71. OpenURL

  2. Morimoto Y, Mvere B: Lagenaria siceraria. In Vegetables Plant resources of Tropical Africa 2. Edited by Grubben GJH, Denton OA. Backhuys Publishers/CTA, Wageningen/Leiden; 2004:353-358. OpenURL

  3. Lee JM: Cultivation of grafted vegetables I. Current status, grafting methods and benefits.

    HortScience 1994, 29:235-239. OpenURL

  4. Yetisir H, Sari N: Effect of different rootstock on plant growth, yield and quality of watermelon.

    Aust J Exp Agric 2003, 43:1269-1274. Publisher Full Text OpenURL

  5. Decker-Walters DS, Wilkins-Ellert M, Chung SM, Staub JE: Discovery and genetic assessment of wild bottle gourd [Lagenaria siceraria (Mol.) Standley; Cucurbitaceae] from Zimbabwe.

    Econ Bot 2004, 58:501-508. OpenURL

  6. Whitaker TW: Endemism and pre-Columbian migration of bottle gourd, Lagenaria siceraria (Mol.) Standl. In Man across the sea. Edited by Riley CL, Kelley JC, Pennington CW, Runds RL. University of Texas Press, Austin; 1971:78-218. OpenURL

  7. Erickson DL, Smith BD, Clarke AC, Sandweiss DH, Tuross N: An Asian origin for a 10,000-year-old domesticated plant in the Americas.

    Proc Natl Acad Sci USA 2005, 102:18315-20.

  8. Morimoto Y, Maundu P, Fujimaki H, Morishima H: Diversity of landraces of the white-flowered gourd (Lagenaria siceraria) and its wild relatives in Kenya: fruit and seed morphology.

    Genet Resour Crop Evol 2005, 52:737-747.

  9. Heiser CB: The gourd book: A thorough and fascinating account of gourds from throughout the world. University of Oklahoma Press, Norman, Oklahoma; 1979.

  10. Decker-Walters D, Staub J, Lopez-Sese A, Nakata E: Diversity in landraces and cultivars of bottle gourd (Lagenaria siceraria; Cucurbitaceae) as assessed by random amplified polymorphic DNA.

    Genet Resour Crop Evol 2001, 48:369-380.

  11. Yetisir H, Sakar M, Serce S: Collection and morphological characterization of Lagenaria siceraria germplasm from the Mediterranean region of Turkey.

    Genet Resour Crop Evol 2008, 55:1257-1266.

  12. Achigan-Dako EG, Fuchs J, Ahanchede A, Blattner FR: Flow cytometric analysis in Lagenaria siceraria (Cucurbitaceae) indicates correlation of genome size with usage types and growing elevation.

    Plant Syst Evol 2008, 276:9-19.

  13. Morimoto Y, Maundu P, Kawase M, Fujimaki H, Morishima H: RAPD polymorphism of the white-flowered gourd (Lagenaria siceraria (Molina) Standl.) landraces and its wild relatives in Kenya.

    Genet Resour Crop Evol 2006, 53:963-974.

  14. Xu P, Wu XH, Wang BG, Liu YH, Qin DH, Ehlers JD, Close TJ, Hu TT, Lu ZF, Li GJ: Development and polymorphism of Vigna unguiculata ssp. unguiculata microsatellite markers used for phylogenetic analysis in asparagus bean (Vigna unguiculata ssp. sesquipedialis (L.) Verdc.).

    Mol Breeding 2010, 25:675-684.

  15. Li CD, Fatokun CA, Ubi B, Singh BB, Scoles GJ: Determining genetic similarities and relationships among cowpea breeding lines and cultivars by microsatellite markers.

    Crop Sci 2001, 41:189-197.

  16. Zane L, Bargelloni L, Patarnello T: Strategies for microsatellite isolation: a review.

    Mol Ecol 2002, 11:1-16.

  17. Maguire T, Collins G, Sedgley M: A modified CTAB DNA extraction procedure for plants belonging to the family Proteaceae.

    Plant Mol Biol Rep 1994, 12:106-109.

  18. Huang SW, Li RQ, Zhang ZH, Li L, Gu XF, Fan W, Lucas WJ: The genome of the cucumber, Cucumis sativus L.

    Nat Genet 2009, 41:1275-1281.

  19. Kolpakov R, Bana G, Kucherov G: mreps: efficient and flexible detection of tandem repeats in DNA.

    Nucl Acids Res 2003, 31:3672-3678.

  20. Martins WS, Lucas DCS, Neves KFS, Bertioli DJ: WebSat-A web software for microsatellite marker development.

    Bioinformation 2009, 3:282-283.

  21. Anderson JA, Churchill GA, Autrique JE, Tanksley SD, Sorrells ME: Optimizing parental selection for genetic linkage maps.

    Genome 1993, 36:181-186.

  22. Rohlf FJ: NTSYSpc, numerical taxonomy and multi-variate analysis system, version 2.1 user guide. Exeter Software, Setauket

  23. Perrotta G, Ninu L, Flamma F, Weller JL, Kendrick RE, Nebuloso E, Giuliano G: Tomato contains homologues of Arabidopsis Cryptochrome 1 and 2.

    Plant Mol Biol 2000, 42:765-773.

  24. Xu P, Xiang Y, Zhu HL, Xu HB, Zhang ZZ, Zhang CQ, Zhang LX, Ma ZQ: Wheat cryptochromes: subcellular localization and involvement in photomorphogenesis and osmotic stress responses.

    Plant Physiol 2009, 149:760-774.

  25. Zhang B, Tolstikov V, Turnbull C, Hicks LM, Fiehn O: Divergent metabolome and proteome suggest functional independence of dual phloem transport systems in cucurbits.

    Proc Natl Acad Sci USA 2010, 107:13532-13537.

  26. Miró M: Cucurbitacins and their pharmacological effects.

    Phytother Res 2006, 9:159-168.

  27. Sharma A, Sharma JP, Jindal R, Kaushik RM: Bottle gourd poisoning.

    JK Sci 2006, 8:120-121.

  28. Bisognin DA: Origin and evolution of cultivated cucurbits.

    Cienc Rural 2002, 32:715-723.

  29. Jarret RL, Merrick LC, Holms T, Evans J, Aradhya MK: Simple sequence repeats in watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai).

    Genome 1997, 40:433-441.

  30. Ren Y, Zhang Z, Liu J, Staub JE, Han Y, et al.: An integrated genetic and cytogenetic map of the cucumber genome.

    PLoS ONE 2009, 4:e5795.

  31. Timko MP, Rushton PJ, Laudeman TW, Bokowiec MT, Chipumuro E, Foo Cheung F, Town CD, Chen XF: Sequencing and analysis of the gene-rich space of cowpea.

    BMC Genomics 2008, 9:103.

  32. González VM, Benjak A, Hénaff EM, Mir G, Casacuberta JM, Garcia-Mas J, Puigdomènech P: Sequencing of 6.7 Mb of the melon genome using a BAC pooling strategy.

    BMC Plant Biol 2010, 10:246.