Association Mapping: A Novel Genomic Approach for Unraveling Genetic Variation in Crop Plants
Author: Ganapathy KN | Co-authors: Sunil Gomashe and Sujay Rakshit

The aim of any genetic mapping study is to identify the QTLs/genomic regions governing phenotypic variation. Until recently, mapping of agronomically important traits was mostly done using segregating populations (F2, F3) or stabilized populations (RILs) derived from biparental matings. The basic requirement for parents selected for biparental mapping is that they should be diverse both at the molecular level and for the traits of interest. Unlike biparental mapping, association mapping exploits ancestral recombination events and the natural variation existing among germplasm/genotypes for genetic dissection of quantitative traits. Association mapping is also known as linkage disequilibrium (LD) mapping, where LD is the non-random association of alleles at different loci, and it focuses on genetic associations among unrelated individuals. Association mapping offers several advantages over conventional biparental mapping, such as higher mapping resolution, reduced mapping time, and detection of a broader range of alleles for the target traits.
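
To make the idea of non-random association of alleles concrete, the sketch below computes two standard LD statistics for a pair of biallelic loci: the coefficient D (observed haplotype frequency minus the frequency expected under independence) and r2. The function name and the ten phased haplotypes are hypothetical and chosen only for illustration; they are not data from any study cited here.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    # D is the excess of the observed AB haplotype over its expectation
    # when alleles at the two loci are independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    # r2 scales D squared by the allele frequencies so it lies between 0 and 1.
    r2 = d * d / denom
    return d, r2


# Hypothetical phased haplotypes for ten inbred lines (illustrative only).
haplotypes = ["AB", "AB", "AB", "AB", "Ab", "aB", "ab", "ab", "ab", "ab"]
n = len(haplotypes)
p_a = sum(h[0] == "A" for h in haplotypes) / n
p_b = sum(h[1] == "B" for h in haplotypes) / n
p_ab = haplotypes.count("AB") / n

d, r2 = linkage_disequilibrium(p_ab, p_a, p_b)
print(f"D = {d:.3f}, r2 = {r2:.3f}")
```

When the two loci are independent, the AB haplotype frequency equals the product of the allele frequencies, so both D and r2 are zero.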

Association mapping is classified into two major categories:

A. Candidate gene based association mapping: This approach is used for traits where prior information about the genes governing them is available. It also requires a good understanding of the biochemistry and genetics of the target traits. It needs fewer markers, and many candidate gene based association studies are done with tens to hundreds of markers. However, the candidate gene approach always carries a risk of missing genes that contribute to the target traits.

B. Genome wide/whole genome association mapping: This is done in populations where there is no prior information about the genomic regions governing the phenotypic variation for the target traits. This approach scans the whole genome for polymorphisms to identify the genomic regions associated with the target traits. SSR markers have generally been used for genome wide association scans, but with the recent availability of high throughput next generation sequencing strategies such as genotyping by sequencing, and the reduced time required for sequencing, it is now possible to identify whole genome sequence variation at the SNP level and associate it with the target traits.

Major requirements/steps in Association mapping
Genetic materials: The choice of genetic material is a critical requirement for association mapping. The genotypes selected should not be identical by descent. Theoretically, a non-structured population is preferred, since association mapping in a structured population can lead to spurious marker-trait associations. A population with only subtle population structure and substantial variability for the target traits is ideal, although different methods and statistical tools are available to correct for population structure in structured populations.

Analysis of genetic structure: The genotypes selected for association mapping should be subjected to genetic structure analysis before proceeding to association analysis. For this, the genotypes are genotyped with well characterized SSR/SNP markers that have uniform coverage throughout the genome. Statistical analysis of genetic structure is done using STRUCTURE software (Pritchard et al. 2000).

Phenotyping of the association panel for target traits: Multilocation phenotyping with an appropriate field design and the required number of replications is a basic requirement for association mapping. Efficient statistical designs such as the α-lattice may be used, since they partition the environmental variance from the phenotypic variance more efficiently. Population size is also an important consideration: a larger population provides more power, but a population size of 200-250 is desirable. Sufficient care should also be taken while recording field observations for the traits of interest.

Genotyping of the association panel: Before genotyping with DNA markers such as SSRs or SNPs, it is necessary to understand the linkage disequilibrium in the association panel, as the number of background markers needed for association mapping depends on the extent of LD. Rafalski (2010) reported that "If the LD decreases to approximately r2 = 0.5 in, on average, 2 cM, in a 1000 cM genome one would have about 1000/2 = 500 blocks of linkage, requiring perhaps 2500 SNP markers or preferably much more, to distinguish common haplotypes and account for large variation in rate of LD decay". In practice, the markers selected for genotyping should be uniformly distributed and should provide dense coverage throughout the genome.
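
The arithmetic in the quotation from Rafalski (2010) can be written as a small calculation. The sketch below is only a rough planning aid: the function name is hypothetical, and the figure of 5 markers per LD block is not stated in the quotation but is inferred here from its numbers (2500 markers / 500 blocks), so it should be treated as an assumption and as a lower bound.

```python
def estimate_marker_count(genome_length_cm, ld_decay_cm, markers_per_block):
    """Rough lower bound on the number of markers for a genome wide scan.

    genome_length_cm  : genetic map length of the genome in cM
    ld_decay_cm       : average distance (cM) over which LD decays to the chosen r2
    markers_per_block : markers wanted per LD block to distinguish common
                        haplotypes (an assumption, see the text above)
    """
    blocks = genome_length_cm / ld_decay_cm
    return blocks, blocks * markers_per_block


# Numbers taken from the quotation: 1000 cM genome, LD decaying within 2 cM.
blocks, markers = estimate_marker_count(1000, 2, 5)
print(f"{blocks:.0f} LD blocks, at least {markers:.0f} markers")
```

If LD decays faster, for example within 0.5 cM, the same calculation gives four times as many blocks and markers, which is why the extent of LD in the panel has to be known before deciding on marker density.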

Association analysis
The phenotypic and genotypic data are used as input to TASSEL software to identify marker-trait associations (Bradbury et al. 2007). TASSEL is one of the most commonly used software packages for association mapping in plants. The Q matrix of inferred ancestry coefficients obtained from STRUCTURE may be used as a covariate in the association analysis to account for population structure. To further refine the results, kinship coefficients (K matrix) are also used, especially in the mixed linear model (MLM) of association analysis.
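
The role of the Q matrix as a covariate can be illustrated with a small, self-contained sketch. This is not TASSEL's implementation: it is a plain ordinary least-squares comparison of a model with and without the marker, it includes only fixed effects, and it leaves out the K matrix and the mixed model entirely. The function names and the twelve-line data set are hypothetical. The trait differs between two subpopulations and the marker allele is more frequent in one of them, but the marker has no effect within either subpopulation.

```python
def ols_rss(columns, y):
    """Residual sum of squares from regressing y on the columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations: (X'X | X'y)
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            raise ValueError("collinear predictors")
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2
               for r in range(n))


def marker_f_statistic(y, marker, q_columns):
    """F statistic (1 numerator df) for adding one marker to a model with Q covariates."""
    rss_reduced = ols_rss(q_columns, y)
    rss_full = ols_rss(q_columns + [marker], y)
    df_resid = len(y) - (len(q_columns) + 2)  # intercept + Q columns + marker
    return (rss_reduced - rss_full) / (rss_full / df_resid)


# Hypothetical panel: six lines from each of two subpopulations.
# With two subpopulations the Q columns sum to 1, so only one is kept
# to avoid collinearity with the intercept.
q1 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
marker = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
trait = [10.2, 9.8, 10.1, 9.9, 10.0, 10.0, 6.1, 5.9, 6.2, 5.8, 6.0, 6.0]

f_naive = marker_f_statistic(trait, marker, [])
f_with_q = marker_f_statistic(trait, marker, [q1])
print(f"F without Q: {f_naive:.2f}")
print(f"F with Q:    {f_with_q:.2f}")
```

In this constructed example the marker looks associated with the trait when population structure is ignored, and the signal disappears once Q is included, which is the kind of spurious marker-trait association that structure correction is meant to remove.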

Selected References:
- Bradbury, P. J., Zhang, Z., Kroon, D. E., Casstevens, T. M., Ramdoss, Y. and Buckler, E. S., 2007, TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23:2633-2635.
- Pritchard, J. K., Stephens, M. and Donnelly, P., 2000, Inference of population structure using multilocus genotype data. Genetics 155:945-959.
- Rafalski, J. A., 2010, Association genetics in crop improvement. Current Opinion in Plant Biology 13:174-180.


About Author / Additional Info:
I am a scientist currently working on genetic improvement of grain quality traits in sorghum using association mapping approaches.