Draft genome sequence of cucumber (Cucumis sativus) reveals mechanism of monoecy and crop improvement traits.

Prasanta K Dash*, Biranchi N Patra^ and Rhitu Rai*

* National Research Centre on Plant Biotechnology, PUSA Campus, New Delhi, India
^ Keck Graduate Institute, Claremont, Ontario, California, USA


Cucumber (Cucumis sativus) is a cucurbit (gourd) belonging to the botanical family Cucurbitaceae, which includes several economically important cultivated species such as watermelon (Citrullus lanatus) and squash (Cucurbita spp.). It is an important crop, produced worldwide (FAO statistics), and is particularly important in Mediterranean and Asian countries [1]. It is the fourth most important vegetable crop worldwide and is believed to be indigenous to India, having probably originated in the foothills of the Himalayan mountains [2],[3]. Both the cultivated form, Cucumis sativus var. sativus, and the wild progenitor, Cucumis sativus var. hardwickii, were discovered in this region. Cultivation of cucumber is believed to have spread to secondary centres of diversity such as China along the Silk Route during the Han Dynasty, a view supported by the discovery of a unique semi-wild landrace, the Xishuangbanna gourd, in Yunnan province in Southwest China [1]. Cucumber is believed to have spread subsequently to other regions of East Asia, Western Asia, and Southern Europe.

Cucumber as a model plant

Cucumber is gaining favour among plant biologists as a new model species because of its biological properties and economic importance. It is an attractive model for studying valuable biological traits owing to its wide variation in observable phenotypes, its short life cycle relative to other crop species (three months from seed to seed, comparable to Arabidopsis), its diversity of sex expression, its fruit ripening and phloem physiology, its relatively small number of genes, and the genetic and genomic resources made available by the deciphered genome sequence [4],[5]. In addition, rapid advances in next-generation sequencing (NGS) technologies have made it affordable to resequence multiple genotypes of many crops, including cucumber, and thereby to generate haplotype maps that display genome-wide patterns of genetic variation at single-base resolution [6].

Genomics in cucumber

Recently, a number of genetic and molecular tools have been developed for cucumber, and others are in progress, including saturated genetic maps, ESTs, microarray chips, a physical map, BAC sequences, and reverse genetic tools. To complete this repertoire of genomic tools, de novo sequencing of the cucumber genome was undertaken by Huang et al. (2009) [5]. The cucumber inbred line 'Chinese long' was sequenced using a hybrid approach that combined Sanger sequencing with the Illumina Genome Analyzer (GA). Sanger sequencing provided longer reads (400-600 bp) at 3.9-fold coverage, whereas Illumina GA reads (42 to 53 bp) provided 68.3-fold coverage of the cucumber genome. Three assemblies were compared:
(1) Sanger reads only,
(2) Illumina GA reads only, and
(3) Sanger plus Illumina hybrid reads.

Of the estimated 350 Mb genome size determined by flow cytometry of isolated nuclei, 243.5 Mb, or about 70% of the estimated cucumber genome, was assembled by the sequencing effort. The draft genome sequence predicts 26,682 genes, with a mean coding sequence length of 1,046 bp and an average of 4.39 exons per gene. The sequence also revealed a large number of transposable elements that had not been identified earlier. Approximately 24% of the genome (54.4 Mb) consists of these elements in the form of repeats. Among them, long terminal repeat (LTR) retrotransposons (gypsy and copia elements) form the largest class of transposable elements, covering 10.4% of the genome [5].
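The sequencing figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that uses only the numbers given in the text; the function and variable names are illustrative and do not come from the original study.

```python
# Figures quoted in the text (Huang et al. 2009 [5]).
SANGER_FOLD = 3.9            # fold coverage from Sanger reads
ILLUMINA_FOLD = 68.3         # fold coverage from Illumina GA reads
ESTIMATED_GENOME_MB = 350.0  # genome size estimate quoted in the text
ASSEMBLED_MB = 243.5         # length of the draft assembly


def combined_fold_coverage(*folds: float) -> float:
    """Total sequencing depth is the sum of the per-technology depths."""
    return sum(folds)


def assembled_fraction(assembled_mb: float, genome_mb: float) -> float:
    """Fraction of the estimated genome represented in the assembly."""
    return assembled_mb / genome_mb


total_fold = combined_fold_coverage(SANGER_FOLD, ILLUMINA_FOLD)
fraction = assembled_fraction(ASSEMBLED_MB, ESTIMATED_GENOME_MB)
unassembled_mb = ESTIMATED_GENOME_MB - ASSEMBLED_MB

print(f"Combined depth: {total_fold:.1f}-fold")
print(f"Assembled: {fraction:.1%} of the estimated genome")
print(f"Not assembled: {unassembled_mb:.1f} Mb")
```

The assembled fraction comes out just under 70%, consistent with the rounded value given in the text. The remainder is the portion of the estimated genome not represented in the draft assembly.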

An important feature that emerged from the genome sequence data is the absence of recent whole-genome duplication (WGD) in cucumber. WGD is very common in angiosperms because genome duplication provides a vast array of nucleic acid sequence as raw material for the genesis of new genes. The model plant Arabidopsis thaliana and rice, for example, underwent recent WGD during their evolution. Cucumber, like grapevine and papaya, shows no such recent WGD, and these genomes therefore offer an opportunity to study ancestral forms and arrangements of plant genes [5].

Disease resistance and signature molecules

Although melon and cucumber belong to the same genus, cucumber has seven chromosomes (2n=14), melon has 12 (2n=24), and the more distantly related watermelon has 11 (2n=22). The genome sequence further indicated that the chromosomal fusions and rearrangements that shaped the cucumber chromosomes likely occurred after the divergence of cucumber and melon. The genome sequence also allowed many disease resistance genes to be identified. Compared with about 200 disease resistance genes in Arabidopsis, 398 in poplar, and 600 in rice, only 61 nucleotide-binding site (NBS)-containing resistance (NBS-R) genes have been identified in cucumber, a number similar to the 55 found in papaya. These genes are non-randomly distributed in the cucumber genome, with only five genes located on chromosomes 1, 6 and 7 but 20 genes located on chromosome 2. In wild melon genotypes, enhanced expression of two glyoxylate aminotransferase genes (At1 and At2) confers resistance to downy mildew, a devastating foliar disease of cucurbits. Two At homologs have been annotated in the cucumber genome and can be used as candidate genes for downy mildew resistance [5].
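To put the resistance gene counts on a common scale, each species can be expressed as a multiple of the cucumber count. The short Python sketch below uses only the counts quoted in the text; the helper name `fold_relative_to` is hypothetical and introduced purely for illustration.

```python
# NBS-containing resistance gene counts as quoted in the text.
NBS_R_GENES = {
    "cucumber": 61,
    "papaya": 55,
    "Arabidopsis": 200,
    "poplar": 398,
    "rice": 600,
}


def fold_relative_to(reference: str, counts: dict[str, int]) -> dict[str, float]:
    """Divide every species' gene count by the reference species' count."""
    ref = counts[reference]
    return {species: n / ref for species, n in counts.items()}


folds = fold_relative_to("cucumber", NBS_R_GENES)

# Print species from the smallest to the largest repertoire.
for species, fold in sorted(folds.items(), key=lambda item: item[1]):
    print(f"{species:12s} {NBS_R_GENES[species]:4d} genes  {fold:4.1f}x cucumber")
```

On these numbers, rice carries roughly ten times as many such genes as cucumber, while papaya and cucumber are close to parity.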

Cucurbitacin, a bitter compound, is a signature metabolite present in all cucurbits. It is a triterpenoid that imparts bitterness to cucurbits and is toxic to most organisms. The presence of cucurbitacin in cucumber is controlled by the gene bi, and the enzyme oxidosqualene cyclase (OSC) catalyzes the formation of the triterpene carbon skeleton in plants. The cucumber genome sequence revealed four OSC genes, and this gene cluster is conjectured to catalyze the stepwise formation of cucurbitacin in cucumber.

Cucumber is an important component of salads in the daily diet. Modern cucumber varieties, like those of many other crops, have evolved from wild ancestors that are less acceptable for human consumption. Wild cucumber plants bear extremely bitter fruit, and an essential step in the domestication of wild cucumber has been the complete or partial loss of fruit bitterness. Two genetic loci, Bi and Bt, are known to confer bitterness in cucumber. The recessive bi allele confers bitter-free foliage, and the dominant Bt allele renders the fruit extremely bitter. Evolution, domestication, and breeding have shaped the current varieties of cucumber through a drastic loss of diversity in genomic regions carrying genes for unfavorable traits such as bitterness. Genomic studies reveal potential domestication sweeps in the cucumber genome between wild and cultivated groups. In total, 112 potential selective sweeps ranging from 50 kb to 800 kb in length (average 138 kb) were identified by resequencing 115 cucumber genotypes collected from all over the world [7]. One of these sweep regions contains a gene involved in the loss of bitterness in fruits, an essential domestication trait of cucumber. Using simple sequence repeat (SSR) markers, the Bi and Bt loci were mapped to chromosomes 6 and 5, respectively. By resequencing the 115 cucumber genotypes, the Bt locus was delimited to a 442-kb region on chromosome 5 that harbors 67 predicted genes. Further genomic studies are expected to narrow the region responsible for bitterness down to a few candidate genes that can be silenced by RNAi or other knockdown procedures [7].
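The scale of these sweeps and of the Bt interval can be gauged with a rough calculation. The Python sketch below combines the sweep figures from [7] with the assembly length and gene count of the draft genome [5]; mixing the two sources is an assumption made only for illustration, and because the mean sweep length is itself a rounded value, the results are approximate.

```python
# Figures quoted in the text.
N_SWEEPS = 112           # potential selective sweeps [7]
MEAN_SWEEP_KB = 138      # average sweep length in kb [7]
BT_INTERVAL_KB = 442     # Bt interval on chromosome 5 [7]
BT_GENES = 67            # predicted genes in the Bt interval [7]
ASSEMBLY_MB = 243.5      # draft assembly length [5]
GENOME_GENES = 26_682    # predicted genes in the draft genome [5]

# Approximate total length covered by sweeps and its share of the assembly.
total_sweep_mb = N_SWEEPS * MEAN_SWEEP_KB / 1000
sweep_share = total_sweep_mb / ASSEMBLY_MB

# Average spacing of predicted genes in the Bt interval and genome-wide.
bt_kb_per_gene = BT_INTERVAL_KB / BT_GENES
genome_kb_per_gene = ASSEMBLY_MB * 1000 / GENOME_GENES

print(f"Sweeps span about {total_sweep_mb:.1f} Mb ({sweep_share:.1%} of the assembly)")
print(f"Bt interval: one predicted gene per {bt_kb_per_gene:.1f} kb")
print(f"Genome-wide: one predicted gene per {genome_kb_per_gene:.1f} kb")
```

Under these assumptions the sweeps cover a small share of the assembly, and the Bt interval is somewhat more gene-dense than the genome-wide average, which illustrates why further fine mapping is needed to reach a handful of candidates.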

Monoecy and organ formation

Among crop species, cucumber is a model system for studying sex expression. The phytohormone ethylene promotes femaleness in cucumber and is considered the sex hormone of cucurbits. The genome sequence revealed 137 cucumber genes related to the biosynthetic and signaling pathways of ethylene, but no gene family expansion or duplication in these pathways was observed relative to other sequenced plant genomes. It was therefore suggested that the origin of monoecy in cucumber involves evolutionary mechanisms other than expansion of ethylene-related genes. Further analysis of the genome sequence showed that six auxin-related genes (auxin regulates sex expression by stimulating ethylene production) and three short-chain dehydrogenase or reductase genes (homologs of the maize sex determination gene ts2) are highly expressed in unisexual flowers. These findings provide a starting point for further study of sex determination in cucumber.

Tendril development is a characteristic feature of cucurbits and a key innovation in plant evolution. In cucumber and grapevine, the phytohormone gibberellic acid regulates tendril formation. Because tendril coiling involves rapid cell wall modification, expansins (cell wall-loosening proteins) are thought to play a major role in the formation of this specialized organ. In cucumber, the expansin subfamily EXLA has undergone extensive expansion through tandem duplication (eight genes in cucumber, compared with one to three genes in other genomes), which is believed to have contributed to the development of tendril coiling.


The increasing availability of genome sequences from higher plants and microbes provides an important tool for understanding microbial diversity, plant evolution, and the genetic variability that exists within cultivated species [8],[9],[10]. Genome sequences are also becoming strategic tools for developing methods to accelerate plant breeding [11]. Cucurbitaceae is the second most economically important group of vegetable crops, and cucumber occupies a key position in this family because of its high economic value and its role as a model crop for studying biologically relevant traits. The deciphered genome sequence therefore gives cucumber breeders an additional tool for their breeding programs.

1. Lv J et al. (2012). Genetic diversity and population structure of cucumber (Cucumis sativus L.). PLoS One. 7(10): e46919.

2. Tatlioglu T. (1993). Cucumber Cucumis sativus L. Genetic improvement of vegetable crops. Tarrytown: Pergamon Press Ltd. 197-234.

3. Sebastian P et al. (2010). Cucumber (Cucumis sativus) and melon (C. melo) have numerous wild relatives in Asia and Australia, and the sister species of melon is from Australia. Proc Natl Acad Sci USA. 107: 14269-14273.

4. Ren Y et al. (2009). An integrated genetic and cytogenetic map of the cucumber genome. PLoS One. 4: e5795.

5. Huang S et al. (2009). The genome of the cucumber, Cucumis sativus L. Nat Genet. 41: 1275-1281.

6. Weigel D and Mott R (2009). The 1001 genomes project for Arabidopsis thaliana. Genome Biol. 10: 107.

7. Qi J et al. (2013). A genomic variation map provides insights into the genetic basis of cucumber domestication and diversity. Nat Genet. 45(12):1510-1515.

8. Wang et al. (2012). The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads. The Plant Journal. 72 (3), 461-473.

9. Dash PK. (2013). Decoding flax genome for structural genomics and functional insights into yield genes. Plant and Animal genome XXI conference.

10. Rai R et al. (2012). Phenotypic and molecular characterization of indigenous rhizobia nodulating chickpea in India. Ind J Expt Biol. 50(5): 340-350.

11. Koundal KR (2003). Plant defense proteins: Mechanism and potential for pest control through genetic manipulation. J Plant Biol. 30(2), 211-228.

12. Image source: http://www.morguefile.com/archive/display/138299 | By rollingroscoe - Copyright free photo

FAO statistics: http://faostat.fao.org