
Background on Comparative Genomic Analysis

December 2002

Sequencing the genomes of the human, the mouse and a wide variety of other organisms - from yeast to chimpanzees - is driving the development of an exciting new field of biological research called comparative genomics.

By comparing the human genome with the genomes of different organisms, researchers can better understand the structure and function of human genes and thereby develop new strategies in the battle against human disease. In addition, comparative genomics provides a powerful new tool for studying evolutionary changes among organisms, helping to identify the genes that are conserved among species along with the genes that give each organism its own unique characteristics.

Using computer-based analysis to zero in on the genomic features that have been preserved in multiple organisms over millions of years, researchers will be able to pinpoint the signals that control gene function, which in turn should translate into innovative approaches for treating human disease and improving human health. In addition, the evolutionary perspective may prove extremely helpful in understanding disease susceptibility. For example, chimpanzees do not suffer from some of the diseases that strike humans, such as malaria and AIDS. A comparison of the sequence of genes involved in disease susceptibility may reveal the reasons for this species barrier, thereby suggesting new pathways for prevention of human disease.
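As a rough illustration of what such computer-based analysis involves, the sketch below scans a set of already-aligned DNA fragments and reports the positions at which every organism carries the same base. The fragments, the species labels and the function name conserved_positions are invented for this example; real comparative analyses work on whole-genome alignments with far more sophisticated scoring.

```python
def conserved_positions(aligned_seqs):
    """Return the indices of alignment columns where all sequences share one base."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned_seqs}
        # A column counts as conserved only if it holds a single base and no gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Made-up aligned fragments (not real genomic data), one per hypothetical species.
toy_alignment = {
    "species_a": "ATGCCGTA",
    "species_b": "ATGACGTA",
    "species_c": "ATGCCGTT",
}

positions = conserved_positions(list(toy_alignment.values()))
print(positions)  # [0, 1, 2, 4, 5, 6]
```

Positions that stay identical across distantly related organisms are the kind of preserved features that such an analysis would flag for closer study.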

Although living creatures look and behave in many different ways, all of their genomes consist of DNA, the chemical chain that makes up the genes that code for thousands of different kinds of proteins. Precisely which protein is produced by a given gene is determined by the sequence in which four chemical building blocks - adenine (A), thymine (T), cytosine (C) and guanine (G) - are laid out along DNA's double-helix structure.
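The link between base order and protein can be sketched in a few lines of code. The example below reads a DNA string three bases at a time and looks up each triplet in the standard genetic code, stopping at the first stop signal. It is a simplified illustration: it reads the coding strand directly and skips the intermediate RNA step, and the function name translate and the short input sequences are assumptions made up for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# "*" marks a stop signal.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA string into one-letter amino acid codes."""
    protein = []
    # Step through complete triplets only; trailing bases are ignored.
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop signal: the protein ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy sequences that differ by a single base (G -> A at the fourth position).
print(translate("ATGGCCTGA"))  # MA
print(translate("ATGACCTGA"))  # MT
```

Changing a single base in the second sequence changes the amino acid that is produced, which is why the exact order of A, T, C and G matters.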

For researchers to use an organism's genome most efficiently in comparative studies, data about its DNA must be in large, contiguous segments, anchored to chromosomes and, ideally, fully sequenced. Furthermore, the data must be organized for easy access and high-speed analysis by sophisticated computer software. The successful sequencing of the human genome, which is scheduled to be finished in April 2003, and the recent draft assemblies of the mouse and rat genomes have demonstrated that large-scale sequencing projects can generate high-quality data at a reasonable cost. As a result, interest in sequencing the genomes of many other organisms has risen dramatically.

In addition to mouse (Mus musculus) and human (Homo sapiens), organisms that have been sequenced include: fruit fly (Drosophila melanogaster); roundworm (Caenorhabditis elegans); yeast (Saccharomyces cerevisiae); a malaria-carrying mosquito (Anopheles gambiae) along with the malaria-causing parasite (Plasmodium falciparum); a long list of microbes; and a couple of plants, including rice (Oryza sativa). While not yet complete, the draft sequence of the rat (Rattus norvegicus) is also of sufficiently high quality to support many valuable comparative analyses.

The fledgling field of comparative genomics has already yielded some dramatic results. For example, a March 2000 study comparing the fruit fly genome with the human genome discovered that about 60 percent of genes are conserved between fly and human. Or, to put it more simply, the two organisms appear to share a core set of genes. Researchers have found that two-thirds of human cancer genes have counterparts in the fruit fly. Even more surprisingly, when scientists inserted a human gene associated with early-onset Parkinson's disease into fruit flies, the flies displayed symptoms similar to those seen in humans with the disorder, raising the possibility that the tiny insects could serve as a new model for testing therapies aimed at Parkinson's.

In September 2002, the cow (Bos taurus), the dog (Canis familiaris) and the ciliate Oxytricha (Oxytricha trifallax) joined the "high priority" list of organisms that the National Human Genome Research Institute (NHGRI) decided to consider for genome sequencing as capacity becomes available. Other high-priority animals include the chimpanzee (Pan troglodytes), the chicken (Gallus gallus), the honey bee (Apis mellifera) and even a sea urchin (Strongylocentrotus purpuratus). With sequencing projects on the human, mouse and rat genomes progressing rapidly and nearing completion, NHGRI-supported sequencing capability is expected to be available soon for work on other organisms.

NHGRI created a priority-setting process in 2001 to make rational decisions about the many requests being brought forward by various communities of scientists, each championing the animals used in its own research. The priority-setting process, which does not result in new grants for sequencing the organisms, is based on the medical, agricultural and biological opportunities expected to be created by sequencing a given organism.

In addition to its implications for human health and well-being, comparative genomics may benefit the animal world as well. As sequencing technology grows easier and less expensive, it will likely find wide application in zoology as a tool to tease apart the often-subtle differences among animal species. Such efforts might lead to the rearrangement of some branches on the evolutionary tree, as well as point to new strategies for conserving or expanding populations of rare and endangered species.

Geoff Spencer
Phone: (301) 402-0911

Last updated: May 23, 2012