Origins of New Genes and Pseudogenes Increase the Genome Complexity of Organisms
Author: Lalbahadur Singh


New gene origin is a driving force of evolutionary innovation in all organisms. Current knowledge of the origin of new genes encompasses information regarding both protein coding genes and RNA genes. All of these genes are transcribed, but only protein coding genes are translated into proteins. The study of pseudogenes, originally defined as sequences that resemble known genes but cannot produce a functional protein, has revealed not only how often genes degenerate, but also that many sequences once believed to be degenerating protein coding genes are in fact functional RNA genes.

Mechanisms of New Gene Generation

Over the years, scientists have proposed several mechanisms by which new genes are generated. These include gene duplication, transposable element protein domestication, lateral gene transfer, gene fusion, gene fission, and de novo origination.

Gene Duplication

Gene duplication was the first mechanism of gene generation to be suggested, and this process does indeed appear to be the most common way of creating new genes. The mechanisms that generate duplicate genes are diverse, and more details about these mechanisms are continually being discovered. These mechanisms include whole genome duplications originating through nondisjunction, tandem duplications originating through unequal crossover, retropositions originating through the retrotranscription of an RNA intermediate, transpositions involving transposable elements, and duplications occurring after rearrangements and subsequent repair of staggered breaks. Such duplications involve not only protein coding genes, but also noncoding RNA genes. For example, a novel class of retroduplicates includes snoRNAs, which are a class of RNA genes that are involved in ribosomal RNA processing.
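To make the structural difference between two of these mechanisms concrete, the following is a minimal Python sketch, not a model of the underlying molecular biology. It assumes a toy gene made of a promoter, exons (uppercase) and introns (lowercase); the `Gene` class, the function names, and all sequences are hypothetical and chosen only for illustration. A tandem duplication places a full copy of the locus next to the original, whereas a retroposition inserts a copy of the spliced transcript, without promoter or introns, at another position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Toy gene: all fields are illustrative, not real sequences."""
    name: str
    promoter: str
    exons: tuple
    introns: tuple  # introns[i] lies between exons[i] and exons[i + 1]

    def genomic_sequence(self):
        """Promoter followed by exons interleaved with introns."""
        parts = [self.promoter]
        for i, exon in enumerate(self.exons):
            parts.append(exon)
            if i < len(self.introns):
                parts.append(self.introns[i])
        return "".join(parts)

    def mature_transcript(self):
        """Spliced transcript: exons only, plus a short poly(A) tail."""
        return "".join(self.exons) + "AAAAAA"


def tandem_duplicate(chromosome, gene):
    """Insert a full second copy directly after the first copy of the locus.

    Unequal crossover itself is not modeled, only its outcome.
    """
    locus = gene.genomic_sequence()
    position = chromosome.index(locus) + len(locus)
    return chromosome[:position] + locus + chromosome[position:]


def retropose(chromosome, gene, target):
    """Insert a copy of the spliced transcript at index `target`.

    Transcription and reverse transcription are assumed to be error free.
    """
    return chromosome[:target] + gene.mature_transcript() + chromosome[target:]


toy = Gene("toy", promoter="tata", exons=("ATGGCC", "GATTGA"), introns=("gtaag",))
chromosome = "cccc" + toy.genomic_sequence() + "gggg"

tandem = tandem_duplicate(chromosome, toy)
retro = retropose(chromosome, toy, len(chromosome))
retrocopy = retro[len(chromosome):]

print(tandem)
print(retrocopy)
```

In this sketch the tandem copy keeps its promoter and intron and sits beside the parent locus, while the retrocopy carries only the joined exons and a poly(A) tail.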

Transposable Element Protein Domestication

Transposable elements (TEs) are so-called selfish segments of DNA that encode proteins that allow these segments to copy or move themselves within a genome. There are two types of TEs: DNA transposons and retrotransposons. DNA transposons are able to excise themselves from the genome and insert themselves somewhere else, whereas retrotransposons copy themselves through an RNA intermediate. Similar to viral insertions in the genome, TE insertions cause mutations and contribute to increased genome size, but they usually do not encode cellular proteins.
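The difference between the two types of TEs can be sketched with plain strings. The Python example below is a simplified illustration under stated assumptions: the genome is a short string, the element is the substring `genome[start:end]`, and the function names `cut_and_paste` and `copy_and_paste` are hypothetical labels for the two modes of movement described above. The RNA intermediate of retrotransposons is not modeled.

```python
def cut_and_paste(genome, start, end, target):
    """DNA-transposon style move: excise genome[start:end] and reinsert it.

    `target` is an index into the genome as it is after the excision.
    """
    element = genome[start:end]
    remainder = genome[:start] + genome[end:]
    return remainder[:target] + element + remainder[target:]


def copy_and_paste(genome, start, end, target):
    """Retrotransposon style copy: the original stays and a copy is inserted.

    The copy is assumed to be identical to the source element.
    """
    element = genome[start:end]
    return genome[:target] + element + genome[target:]


genome = "AAAA" + "TETE" + "CCCC"   # "TETE" stands in for the element

moved = cut_and_paste(genome, 4, 8, 8)
copied = copy_and_paste(genome, 4, 8, len(genome))

print(len(genome), len(moved), len(copied))
```

Under these assumptions the excised element changes position without changing the length of the string, while each copy-and-paste event adds one more copy, which is one simple way to picture how TE activity can contribute to genome size.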

Interestingly, one way for a genome to acquire new genes is by recruiting transposable element proteins and using them as cellular proteins. Such events are called domestications of TE proteins.

Lateral Gene Transfer

The term lateral gene transfer refers to the case in which a gene does not have a vertical origin (i.e., direct inheritance from parent to offspring) but instead comes from an unrelated genome. It is well known that this sort of transfer occurs between bacteria, and that it has also taken place between the genomes of the cellular organelles (mitochondria and chloroplasts) and the nuclear genomes. More recent transfer events from organelles and/or endosymbiont bacteria also continue to occur. For example, large-scale sequencing efforts have revealed that much of the genome of the intracellular endosymbiont Wolbachia pipientis has been integrated into the genomes of Drosophila species. However, the mechanism for these transfers remains largely unknown, and the functional consequences of some of these transfers have yet to be explored.

Gene Fusion and Fission

Existing genes can also fuse (i.e., two or more genes can become part of the same transcript) or undergo fission (i.e., a single transcript can break into two or more separate transcripts), thereby forming new genes. Interestingly, it has been observed that chimeric fusion genes sometimes involve two copies of the same gene (e.g., the alcohol dehydrogenase gene). When that happens, the resulting genes undergo parallel evolution, in which they shift away from the functions of their parental genes.

De Novo Gene Origin

New genes can additionally originate de novo from noncoding regions of DNA. Indeed, several novel genes derived from noncoding DNA have recently been described in Drosophila. For these recently originated Drosophila genes with likely protein coding abilities, there are no homologues in any other species. Note, however, that the de novo genes described in various species thus far include both protein coding and noncoding genes. These new genes sometimes originate on the X chromosome, and they often have male germline functions.

The action of all the mechanisms described in the previous sections leads to exon shuffling (i.e., the observation that many genes share exons).

What Happens to New Genes?

All these new sequences add to the complexity and diversity of genomes. As with any mutation, when new genes become fixed in a genome, they add to the differences between species and serve as the raw material for evolution. This is easy to see in the case of gene duplication. Gene duplication results in two or more copies of a gene: one that can maintain its original function in the organism, and others that are free to accumulate changes and take on new functions. As a consequence, new duplicates are a main source of genome innovation and often evolve under positive selection, in which rapid changes in the protein encoded by the new gene allow it to gain a new function. This process is referred to as neofunctionalization of the new gene.

The Origin and Fate of Pseudogenes

Pseudogenes originate through the same mechanisms as protein coding genes, followed by the accumulation of disabling mutations (e.g., nucleotide insertions, deletions, and/or substitutions) that disrupt the reading frame or introduce a premature stop codon. Pseudogenes can be broadly classified into two categories: processed and nonprocessed. Nonprocessed pseudogenes usually contain introns, and they are often located next to their paralogous parent gene. Processed pseudogenes are thought to originate through retrotransposition; accordingly, they lack introns and a promoter region, but they often contain a polyadenylation signal and are flanked by direct repeats. Errors in reverse transcription and the lack of an appropriate regulatory environment often lead to the degeneration of processed copies of genes. The abundance of pseudogenes in a given genome usually depends on rates of gene duplication and loss.
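The two kinds of disabling mutation mentioned above, a disrupted reading frame and a premature stop codon, can be illustrated with a short Python sketch. It is a deliberately crude check under stated assumptions: sequences are given as coding strands starting at the start codon, the standard stop codons TAA, TAG and TGA apply, and a frameshift is inferred only from a length difference that is not a multiple of three (real analyses would align the copy to its parent gene). The function names and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete codons, ignoring any trailing bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def first_premature_stop(cds):
    """Return the 0-based index of the first in-frame stop codon that occurs
    before the final codon, or None if there is none."""
    triplets = codons(cds.upper())
    for index, codon in enumerate(triplets[:-1]):
        if codon in STOP_CODONS:
            return index
    return None


def disabling_features(parent_cds, copy_cds):
    """List simple signs that a copy can no longer encode the parent protein."""
    features = []
    # Crude proxy: an insertion or deletion whose length is not a multiple
    # of three shifts the reading frame downstream of the change.
    if (len(copy_cds) - len(parent_cds)) % 3 != 0:
        features.append("frameshift (length change not a multiple of 3)")
    stop = first_premature_stop(copy_cds)
    if stop is not None:
        features.append(f"premature stop at codon {stop}")
    return features


parent = "ATGGCTGAAAAGTGGTAA"        # ATG GCT GAA AAG TGG TAA
substituted = "ATGGCTTAAAAGTGGTAA"   # substitution turns GAA into TAA
deleted = "ATGGCTGAAAGTGGTAA"        # one base deleted from the run of A's

for name, seq in [("parent", parent), ("substituted", substituted), ("deleted", deleted)]:
    print(name, disabling_features(parent, seq))
```

A copy that shows either feature would be a candidate pseudogene in this toy setting; as noted earlier in the article, however, such sequences may still be transcribed and can turn out to be functional RNA genes.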


1. Chandrasekaran, C., Betran, E. (2008). Origins of new genes and pseudogenes. Nature Education, 1(1), 181.

About Author / Additional Info:
I am currently pursuing a Ph.D. in biotechnology at the Indian Agricultural Research Institute (IARI), New Delhi, where I am working on miRNAs in pulse crops.