Allele mining applications and challenges in crop improvement
Author: Jitendra kumar Meena

Allele mining is a promising approach for dissecting naturally occurring allelic variation at candidate genes that control key agronomic traits, and it has potential applications in crop improvement programmes. It is the process of detecting new, superior alleles for traits such as disease resistance, drought tolerance and quality. These alleles can be used in breeding programmes to develop efficient, high-yielding crop varieties that address emerging challenges. Although large numbers of germplasm lines have been collected, the available genetic variation has not been explored or utilized efficiently, so there is still ample scope to identify many superior alleles. Allele mining can therefore be used to tap this potential genetic diversity.

True allele mining considers variation in both the expressed and the non-expressed regions of a gene, including the promoter, 5’ UTR, exons, introns, splice sites and 3’ UTR.


There are enough examples to show that intronic mutations play an important role in creating allelic diversity with the potential to alter the phenotype. Sequence variation in the regulatory regions of a gene has recently been gaining importance because it is directly involved in gene expression. Two major approaches are available for identifying sequence polymorphisms for a given gene in naturally occurring populations: (i) EcoTILLING, a modified TILLING (Targeting Induced Local Lesions in Genomes) procedure, and (ii) sequencing-based allele mining.
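As a rough illustration of the sequencing-based approach, the sketch below scans a set of already aligned sequences of one candidate gene and reports the columns where accessions differ. It is a minimal example under stated assumptions, not a published pipeline: the function name `polymorphic_sites`, the accession labels and the toy sequences are all hypothetical, and real data would first need a proper multiple sequence alignment.

```python
from collections import Counter


def polymorphic_sites(aligned):
    """Return {1-based position: Counter of characters} for variable columns.

    `aligned` maps an accession name to its aligned sequence. All sequences
    must have the same length; a gap is written as '-'.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")

    sites = {}
    # zip(*...) walks the alignment column by column.
    for pos, column in enumerate(zip(*aligned.values()), start=1):
        counts = Counter(base.upper() for base in column)
        if len(counts) > 1:  # more than one state => polymorphic column
            sites[pos] = counts
    return sites


# Hypothetical toy alignment of one gene region in four accessions.
alignment = {
    "acc1": "ATGGCTAGGT",
    "acc2": "ATGGCTAGGT",
    "acc3": "ATGACTAGGT",
    "acc4": "ATGACT-GGT",
}

sites = polymorphic_sites(alignment)
for pos, counts in sites.items():
    print(pos, dict(counts))
```

Gaps are treated here as just another character state, so an insertion or deletion shows up as a polymorphic column alongside base substitutions.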


Allele mining plays an important role in tracing superior alleles by ‘mining’ the gene of interest from diverse genetic resources. It can also be used to study the rate of evolution of alleles, allelic similarity or dissimilarity at a candidate gene, and allelic synteny with other members of the family, and to identify the nucleotide sequence changes associated with superior alleles. Allele mining may also pave the way for molecular discrimination among related species, for the development of allele-specific molecular markers, and for the introgression of novel alleles through marker-assisted selection (MAS) or their deployment through genetic engineering (GE).

1. Identification of, and access to, allelic variation that affects the plant phenotype is of utmost importance for the utilization of genetic resources in crop improvement. Allele mining appears to be a promising, although largely untested, method for unlocking the diversity held in gene bank collections worldwide (Kaur et al., 2008a).

2. Allele mining can be employed to identify nucleotide variation at a genomic region (candidate gene) associated with phenotypic variation for a trait. This makes it possible to evaluate the frequency, type and extent of occurrence of new haplotypes and the resulting phenotypic changes. Many case studies have analysed haplotype diversity and identified new haplotypes at candidate genes. Knowledge of the most common haplotype changes and their frequency in populations forms the basis for association mapping studies.

3. Identification of sequence variation paves the way for developing allele-specific marker assays for precise introgression of the identified ‘superior and/or novel’ alleles into a suitable genetic background. In recent years, several case studies have demonstrated the existence of sequence variation at key genes, and some have shown the utility of this variation by developing allele-specific molecular markers for MAS. For instance, in rice, comparison of the nucleotide sequences of the Waxy gene (which codes for a granule-bound starch synthase) in 18 accessions revealed five different alleles, each characterized by a unique replacement, frameshift or splice donor site mutation. All the alleles were clearly associated with the observed phenotypic alterations (Mikami et al., 2008).

4. Using the sequence information obtained from allele mining studies, syntenic relationships can be assessed among the identified loci or genes across species and genera. In rye, superior homologous alleles for aluminum tolerance were isolated using syntenic allele sequence information from wheat, and the same technique has been employed to isolate agronomically superior alleles in Phaseolus vulgaris (common bean) and in grasses other than rye (Fontecha et al., 2007).
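The haplotype analysis described in point 2 above can be sketched in a few lines. This is only an illustrative example under stated assumptions: the helper names, the accession labels, the chosen variable positions and the trait scores are all hypothetical and do not come from any of the cited studies. It groups accessions by their bases at the variable sites of a candidate gene, computes haplotype frequencies, and averages a trait value per haplotype.

```python
from collections import defaultdict
from statistics import mean


def haplotype_groups(aligned, positions):
    """Group accession names by their bases at the given 1-based positions."""
    groups = defaultdict(list)
    for name, seq in aligned.items():
        hap = "".join(seq[p - 1].upper() for p in positions)
        groups[hap].append(name)
    return dict(groups)


def haplotype_frequencies(groups):
    """Fraction of accessions that carry each haplotype."""
    total = sum(len(members) for members in groups.values())
    return {hap: len(members) / total for hap, members in groups.items()}


def mean_trait_by_haplotype(groups, trait):
    """Average trait value of the accessions in each haplotype group."""
    return {hap: mean(trait[name] for name in members)
            for hap, members in groups.items()}


# Hypothetical aligned sequences and trait scores for four accessions.
alignment = {
    "acc1": "ATGGCTAGGT",
    "acc2": "ATGGCTAGGT",
    "acc3": "ATGACTAGGT",
    "acc4": "ATGACT-GGT",
}
trait_score = {"acc1": 7.0, "acc2": 9.0, "acc3": 4.0, "acc4": 5.0}
variable_positions = [4, 7]  # assumed to be known polymorphic sites

groups = haplotype_groups(alignment, variable_positions)
freqs = haplotype_frequencies(groups)
trait_means = mean_trait_by_haplotype(groups, trait_score)
for hap in groups:
    print(hap, groups[hap], freqs[hap], trait_means[hap])
```

A difference in mean trait value between haplotype groups is only a first hint. With realistic sample sizes, a formal association test that accounts for population structure would be needed before calling an allele superior.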


Several challenges hinder the efficient and effective use of allele mining to tap the potential genetic diversity in breeding field crops.

1. The foremost challenge in unlocking the existing variation is the selection of germplasm to be ‘mined’. Given a reliable protocol for characterizing a gene, screening the entire collection would certainly help to find rare alleles, but it is an enormous and inefficient undertaking. The required number of genotypes should therefore be selected by forming a mini core collection that represents the maximum diversity of the main collection. To avoid redundancy or repetition of genotypes, efficient computational tools such as ‘PowerCore’ may be used.

2. Efficient phenotyping is highly important in allele mining. Inaccurate phenotyping may mislead the analysis and eventually produce wrong results, whereas precise phenotyping increases the chances of detecting a potentially superior allele.

3. Relevant and efficient bioinformatics tools should be used to handle the large volumes of sequence and other data generated in the course of the allele mining process.

4. Identifying the putative promoter, other regulatory regions and the exonic regions is a difficult task in genic diversity studies. Variation in both expressed and non-expressed regions may have a significant effect on the phenotype, so proper care has to be taken while characterizing the gene.

5. High cost is the major hurdle in sequence-based allele mining. However, efficient techniques such as Next Generation Sequencing may considerably reduce the cost of sequencing and thus make the process cost-effective.
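The germplasm selection problem in the first challenge above can be illustrated with a simple greedy heuristic: repeatedly pick the accession that adds the most alleles not yet represented, until every observed allele is covered. This is a minimal sketch of the general idea of a non-redundant core subset, not the algorithm implemented in PowerCore. The function name `greedy_core`, the accession labels and the allele codes are hypothetical.

```python
def greedy_core(allele_sets):
    """Pick accessions greedily until every observed allele is covered.

    `allele_sets` maps an accession name to the set of alleles it carries.
    Returns the chosen accession names in the order they were selected.
    """
    uncovered = set().union(*allele_sets.values())
    chosen = []
    while uncovered:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(allele_sets),
                   key=lambda name: len(allele_sets[name] & uncovered))
        chosen.append(best)
        uncovered -= allele_sets[best]
    return chosen


# Hypothetical marker data: "locus:allele" codes per accession.
accessions = {
    "acc1": {"geneA:1", "geneB:1"},
    "acc2": {"geneA:1", "geneB:2"},
    "acc3": {"geneA:2", "geneB:2", "geneC:1"},
    "acc4": {"geneA:1", "geneB:1", "geneC:1"},
    "acc5": {"geneA:3", "geneB:1"},
}

core = greedy_core(accessions)
covered = set().union(*(accessions[name] for name in core))
print(core, len(covered))
```

A greedy choice does not guarantee the smallest possible subset, but it always retains every allele seen in the full collection, which is the property that matters when the goal is not to lose rare alleles before mining.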

About Author / Additional Info:
Division of Genetics, Indian Agricultural Research Institute, Pusa, New Delhi-110012