Until the early 1970s, DNA was the most difficult cellular molecule for biochemists to analyze. Today it is the easiest molecule to analyze, isolate, or amplify: DNA can be isolated from a specific region of the genome, copied in virtually unlimited quantities, and sequenced nucleotide by nucleotide. When the Human Genome Project (HGP) was at its peak, sequencing factories were generating DNA sequences at a rate of 1000 nucleotides per second. The project was initiated in 1990 as a joint effort of the U.S. Department of Energy and the National Institutes of Health, with Dr. James Watson as its first director. Many collaborating countries, including the UK, USA, Japan, France, Germany, and China, took part in decoding around 3 billion bases of DNA. The working draft of the entire human genome, covering more than 90% of the genome, was completed in June 2000, and analyses of the working draft were published in February 2001. Sequencing was completed in April 2003, and the project was declared finished two years ahead of schedule.
The main goals of this project were to identify all of the estimated 35,000 genes in human DNA and to determine the sequence of the 3 billion chemical base pairs that make up human DNA. This information was stored in databases that are freely accessible and can be explored with bioinformatics tools.
The Human Genome Project draft shows that the human genome contains about 3 billion chemical nucleotide bases (A, C, T, and G) in 23 chromosomes. The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene being dystrophin at 2.4 million bases. The total number of genes is estimated at around 35,000, so only around 2% of the DNA codes for proteins; the rest has no coding regions and is often called junk DNA. Almost all (99.9%) nucleotide bases are exactly the same in all people. The human genome's gene-dense "urban centers" are predominantly composed of the DNA building blocks G and C. In contrast, the gene-poor "deserts" are rich in the DNA building blocks A and T. GC- and AT-rich regions can usually be seen through a microscope as light and dark bands on chromosomes. Stretches of up to 30,000 C and G bases repeating over and over often occur adjacent to gene-rich areas, forming a barrier between the genes and the junk DNA. Chromosome 1 has the most genes (2968), and the Y chromosome has the fewest (231). Repeated sequences that do not code for proteins ("junk DNA") make up at least 50% of the human genome. These repetitive sequences are thought to have no direct function, but they shed light on chromosome structure. The repeats reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling existing genes.
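The contrast between GC-rich and AT-rich regions described above can be illustrated with a small calculation. The following is only a minimal sketch in Python: the function names `gc_content` and `windowed_gc` are hypothetical, and the toy sequence is made up for illustration rather than taken from the human genome.

```python
def gc_content(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window):
    """Return the GC fraction of each consecutive, non-overlapping window."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


# Made-up toy sequence: a GC-rich stretch followed by an AT-rich stretch.
toy = "GCGCGGCCGC" + "ATATTAAATA"
print(windowed_gc(toy, 10))  # [1.0, 0.0]
```

Scanning a real chromosome with a much larger window would, under the same assumptions, give a rough profile of where the GC-rich and AT-rich regions lie.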
Comparing the human genome with those of other organisms shows that, unlike the seemingly random distribution of gene-rich areas in humans, many other organisms' genomes are more uniform, with genes evenly spaced throughout. Humans have on average three times as many kinds of proteins as the fly or worm because of mRNA transcript "alternative splicing" and chemical modifications to the proteins. These processes can yield different protein products from the same gene. Humans share most of the same protein families with worms, flies, and plants, but the number of gene family members has expanded in humans, especially in proteins involved in development and immunity.
Scientists have found that about 3 million locations in human DNA have single-base differences (SNPs). This information promises to revolutionize the processes of finding chromosomal locations for disease-associated sequences and tracing human history. The ratio of germline (sperm or egg cell) mutations is 2:1 in males vs females. Researchers point to several reasons for the higher mutation rate in the male germline, including the greater number of cell divisions required for sperm formation than for eggs. The DNA technologies behind this work have led to the discovery of whole new classes of proteins and genes, while revealing that many proteins have been much more highly conserved in evolution than had been suspected. They have also provided new tools for determining the functions of proteins and of individual domains within proteins, revealing a host of unexpected relationships between them. By making large amounts of protein available, they have yielded an efficient way to mass-produce protein hormones and vaccines.
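The SNP figure quoted above can be checked against the statement that 99.9% of bases are identical in all people. The following minimal Python sketch does that arithmetic using the two numbers given in the text, then counts single-base differences between two made-up toy sequences. The helper name `count_single_base_differences` and both sequences are hypothetical, chosen only for illustration.

```python
GENOME_SIZE = 3_000_000_000  # bases, the figure quoted in the text
SNP_SITES = 3_000_000        # single-base differences, the figure quoted in the text

snp_fraction = SNP_SITES / GENOME_SIZE
print(f"{snp_fraction:.1%} of positions vary")           # 0.1% of positions vary
print(f"{1 - snp_fraction:.1%} of positions are shared")  # 99.9% of positions are shared


def count_single_base_differences(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Made-up toy sequences differing at a single position.
person_a = "ACGTTGCAAC"
person_b = "ACGTTGCGAC"
print(count_single_base_differences(person_a, person_b))  # 1
```

This simple position-by-position comparison assumes the two sequences are already aligned; real genomes also differ by insertions and deletions, which it does not handle.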
Even with the complete human genome sequenced, much remains to be discovered. Still unknown or poorly understood are: the number, exact locations, and functions of genes; gene regulation; the organization of DNA sequence; chromosomal structure and organization; the types, amount, distribution, information content, and functions of non-coding DNA; the coordination of gene expression, protein synthesis, and post-translational events; the interaction of proteins in complex molecular machines; evolutionary conservation among organisms; proteomes (the total protein content and function) of organisms; the correlation of SNPs (single-base DNA variations among individuals) with health and disease; complex systems biology, including microbial consortia useful for environmental restoration; and developmental genetics and genomics.
The benefits of knowing the sequence of the human genome are as follows:
a) Molecular Medicine: the diagnosis of disease can be improved and genetic predispositions to disease can be detected. Drugs can be designed based on individual genetic profiles (pharmacogenomics).
b) Microbial Genomics: pathogens can be rapidly detected and treated. Microbial genomics also helps in developing new energy sources (biofuels), monitoring the environment to detect pollutants, protecting against biological and chemical warfare, and cleaning up toxic wastes safely and efficiently.
c) Risk Assessment: the health risks faced by individuals who may be exposed to radiation and carcinogenic substances can be evaluated.
d) Bioarchaeology, Anthropology, Evolution, and Human Migration: evolution can be studied through germline mutations in lineages, and the migration of different population groups can be traced based on maternal inheritance. Mutations on the Y chromosome are studied to trace the lineage and migration of males.
e) Forensics: potential suspects whose DNA may match evidence left at crime scenes can be identified more easily, and persons wrongly accused of crimes can be exonerated. DNA sequences also help establish paternity and other family relationships and match organ donors with recipients for transplantation.
f) Agriculture, Livestock Breeding, and Bioprocessing: disease-, insect-, and drought-resistant crops can be grown, and healthier, more productive, disease-resistant farm animals can be bred. Edible vaccines can be incorporated into food products, and new environmental cleanup uses can be developed for plants like tobacco.
Many ethical, legal, and social issues also arise from this development. They include the privacy and confidentiality of genetic information; fairness in the use of genetic information by insurers, employers, courts, schools, adoption agencies, and the military, among others; the psychological impact of, and discrimination arising from, an individual's genetic differences; the education of doctors and other health-service providers, people identified with genetic conditions, and the general public about capabilities, limitations, and social risks; and the implementation of standards and quality-control measures.
Further concerns include uncertainties associated with gene tests for susceptibilities and complex conditions (e.g., heart disease, diabetes, and Alzheimer's disease); fairness in access to advanced genomic technologies; conceptual and philosophical implications regarding human responsibility, free will vs genetic determinism, and concepts of health and disease; health and environmental issues concerning genetically modified (GM) foods and microbes; and the commercialization of products, including property rights (patents, copyrights, and trade secrets) and the accessibility of data and materials.
Now that the HGP is complete, attention has turned to the HapMap project, which charts genetic variation within the human genome, and to exploring microbial genomes for energy and the environment. The HapMap began in 2002 and took 3 years to construct a map of the patterns of SNPs (single nucleotide polymorphisms) that occur across populations in Africa, Asia, and the United States. Researchers hope that dramatically decreasing the number of individual SNPs to be scanned will provide a shortcut for identifying the DNA regions associated with common complex diseases. The map may also be useful in understanding how genetic variation contributes to responses to environmental factors.