Genomics is the scientific discipline concerned with mapping, sequencing, and analyzing genomes. In drug development, functional genomics is combined with methods such as bioinformatics, animal models, and DNA chip technology to identify and characterize the genes responsible for human diseases. In short, the genome is the complete genetic material of an organism. The human genome contains about 3 billion base pairs, distributed across 23 pairs of chromosomes. In 1992, Watson defined the genome as a functional unit with a size equivalent to the genome of E.coli. Only about 1.5 percent of these 3 billion base pairs encode proteins; these coding sequences carry the information that cells use for protein synthesis.

Genome Size for Various Organisms

Genome size does not necessarily correlate with the number of chromosomes.

Examples include:
(1) The E.coli genome consists of 4.6 million base pairs arranged in a single circular chromosome;
(2) The yeast genome consists of about 12 million base pairs arranged in 16 chromosomes;
(3) The fruit fly genome consists of about 160 million base pairs organized in 4 chromosomes;
(4) The corn (Zea mays) genome consists of about 2.3 billion base pairs organized in 10 chromosomes.

E.coli and yeast are used as tools for sequencing the human genome and play an important role in the fermentation processes that produce fuels, biopharmaceuticals, and chemicals.

Human Genome Project

Human genome mapping began in the 1970s with markers spaced at intervals of 20 cM (1 cM corresponds to roughly 10^6 base pairs). This spacing was initially thought sufficient, but it was later realized that markers every 2 cM were required. The markers were highly polymorphic, so changes in the genes could be detected easily. Polymorphism is a difference between individuals in the DNA base sequence or in the corresponding amino acid sequence of a protein. The first proposals for the Human Genome Project came in the mid-1980s. They aimed at sequencing and defining all human genes.
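The marker spacing discussed above can be turned into a rough count of markers. The sketch below assumes a genome of about 3 billion base pairs and evenly spaced markers, and it uses the rule of thumb that 1 cM corresponds to about 1 million base pairs. The function names are hypothetical and chosen only for illustration.

```python
import math

# Assumptions for illustration only:
# - rule of thumb: 1 cM corresponds to about 1,000,000 base pairs
# - human genome of about 3 billion base pairs
# - markers are spaced evenly along the map
BP_PER_CM = 1_000_000
GENOME_BP = 3_000_000_000


def map_length_cm(genome_bp: int, bp_per_cm: int = BP_PER_CM) -> float:
    """Approximate genetic map length in centimorgans."""
    return genome_bp / bp_per_cm


def markers_needed(genome_bp: int, spacing_cm: float,
                   bp_per_cm: int = BP_PER_CM) -> int:
    """Number of evenly spaced markers needed at the given spacing."""
    return math.ceil(map_length_cm(genome_bp, bp_per_cm) / spacing_cm)


if __name__ == "__main__":
    for spacing in (20, 2):
        count = markers_needed(GENOME_BP, spacing)
        print(f"{spacing} cM spacing: about {count} markers")
```

Under these assumptions, moving from 20 cM to 2 cM spacing raises the marker count tenfold, which gives a sense of why the denser map required far more work.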

However, the project was formally launched only in 1990, and the sequencing was completed by 2003. The primary objectives of the Human Genome Project were to: (1) sequence the genomes of humans and model organisms; (2) identify genes; (3) develop new technologies to achieve these objectives.

Gene Mapping and Sequencing

A joint research plan by the U.S. Department of Energy (DOE) and the National Institutes of Health (NIH) was initiated to map and sequence the estimated 90,000-100,000 human genes. As the sequencing project progressed, however, scientists found that only 20,000-25,000 genes were present. On February 15, 2001, a special issue of Nature published the sequence generated by the Human Genome Project, while the Celera Genomics sequence was published in Science on February 16, 2001. By April 2003, 99 percent of the gene-containing human DNA had been sequenced to 99.99 percent accuracy, concluding the Human Genome Project.
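To make the 99.99 percent accuracy figure concrete, the following sketch converts an accuracy percentage into an expected number of erroneous bases. It assumes errors are spread uniformly and, for the second example, a genome of about 3 billion base pairs. The function name is hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def expected_errors(bases: int, accuracy_percent: str) -> Fraction:
    """Expected number of wrong bases, assuming uniformly spread errors.

    accuracy_percent is passed as a string (for example "99.99") so that
    Fraction can represent it exactly.
    """
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return bases * error_rate


if __name__ == "__main__":
    # 99.99 percent accuracy corresponds to one error per 10,000 bases.
    print(expected_errors(10_000, "99.99"))
    # Assumed genome size of about 3 billion base pairs.
    print(expected_errors(3_000_000_000, "99.99"))
```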

Gene Markers and Nucleotides

Gene markers are identified by their location in a particular chromosomal region. Physical maps make it possible to identify the location and function of genes. Once the genes have been located, robots carry out many of the operations needed to determine the nucleotide sequences. For some regions the nucleotides cannot be determined, or cannot be determined clearly, because the available biochemical methods and enzymes are unable to depolymerize DNA completely or accurately into individual nucleotides. Resolving these regions is a tedious, labor-intensive process that requires manual modifications. The finishing step takes relatively more time because it pieces the sequenced fragments together in the correct order. This step requires proper organization and algorithms to reconstruct the sequenced data into contiguous pieces of DNA. When the DNA sequences did not overlap, the experiment was repeated. The repetition aimed to achieve 99.99 percent accuracy for the completed DNA sequences.
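The idea of piecing sequenced fragments together by their overlaps can be illustrated with a toy example. The sketch below uses a simple greedy strategy (repeatedly merging the two fragments with the longest suffix-prefix overlap). This is an illustrative simplification, not the method actually used by the Human Genome Project; the fragment strings and function names are invented, and real assemblers must also handle sequencing errors, repeats, and both DNA strands.

```python
from typing import List


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: List[str], min_overlap: int = 3) -> List[str]:
    """Merge fragments greedily; return the remaining contiguous pieces."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            # No overlaps left: the gap would need more sequencing.
            break
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"]))
    print(greedy_assemble(["ATGCGT", "TTTAAA"]))
```

In the first call the three invented fragments overlap pairwise and merge into a single piece. In the second call the fragments share no overlap, so two separate pieces remain, which corresponds to the situation where the experiment had to be repeated.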


During the first 7 years of the Human Genome Project, only around 2 to 3 percent of the human genome was sequenced. Genome sequencing for microorganisms such as E.coli, Bacillus subtilis, and S.cerevisiae gained industrial importance. By late July 2008, around 800 genomes from organisms in all kingdoms had been completely sequenced, and another 1,500 genomes were at various stages of completion.

Human genome sequencing and mapping have vastly increased the available information on developmental biology, human metabolism, and human diseases, and have driven advances in robotics, DNA microchip technology, PCR, and comparative genomics.

About Author / Additional Info:
A freelancer