DNA Sequencing Fact Sheet

National Human Genome Research Institute

National Institutes of Health
U.S. Department of Health and Human Services

What is DNA sequencing?

Finding a single gene amid the vast stretches of DNA that make up the human genome - three billion base-pairs' worth - requires a set of powerful tools. The Human Genome Project (HGP) was devoted to developing new and better tools to make gene hunts faster, cheaper and practical for almost any scientist to accomplish.

These tools include genetic maps, physical maps and DNA sequence - which is a detailed description of the order of the chemical building blocks, or bases, in a given stretch of DNA. Indeed, the monumental achievement of the HGP was its successful sequencing of the entire length of human DNA, also referred to as the human genome.

Scientists need to know the sequence of bases because it tells them the kind of genetic information that is carried in a particular segment of DNA. For example, they can use sequence information to determine which stretches of DNA contain genes, as well as to analyze those genes for changes in sequence, called mutations, that may cause disease.
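As a rough illustration of what analyzing a gene for changes in sequence can look like in practice, the short Python sketch below compares a sample sequence against a reference, base by base, and reports where they differ. The function name and both sequences are made up for this example and are not taken from any real gene. The sketch also assumes the two sequences are the same length and already lined up, so it finds only single-base substitutions, not insertions or deletions.

```python
def find_substitutions(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must have the same length;
    this simple comparison does not handle insertions or deletions.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper())
        )
        if ref_base != sample_base
    ]


# Hypothetical 10-base sequences, invented for illustration only.
reference = "ATGGCCATTG"
sample = "ATGGCGATTG"

print(find_substitutions(reference, sample))  # [(6, 'C', 'G')]
```

Here the only difference is at the sixth base, where the reference has C and the sample has G. Real analyses work on far longer sequences and must first align them, which this sketch leaves out.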

What sequencing methods were developed?

The first methods for sequencing DNA were developed in the mid-1970s. At that time, scientists could sequence only a few base pairs per year, not nearly enough to sequence a single gene, much less the entire human genome. By the time the HGP began in 1990, only a few laboratories had managed to sequence a mere 100,000 bases, and the cost of sequencing remained very high. Since then, technological improvements and automation have increased speed and lowered cost to the point where individual genes can be sequenced routinely, and some labs can sequence well over 100 million bases per year.

Beginning in the late 1990s, the scientific community witnessed a remarkable series of accomplishments related to DNA sequencing. In addition to the historic sequencing of the human genome, sequences have now been generated for the genomes of several key model organisms, including the mouse (Mus musculus); the rat (Rattus norvegicus); two fruit flies (Drosophila melanogaster and D. pseudoobscura); two roundworms (Caenorhabditis elegans and C. briggsae); yeast (Saccharomyces cerevisiae) and several other fungi; a malaria-carrying mosquito (Anopheles gambiae) along with a malaria-causing parasite (Plasmodium falciparum); two sea squirts (Ciona savignyi and C. intestinalis); a long list of microbes; and a couple of plants, including mustard weed (Arabidopsis thaliana) and rice (Oryza sativa). Sequencing work is well underway on the honey bee (Apis mellifera), and is just getting started or expected to begin soon on the chimpanzee (Pan troglodytes), the cow (Bos taurus), the dog (Canis familiaris) and the chicken (Gallus gallus). The relative genetic simplicity of many of these model organisms makes them ideal terrain for future technology development.

Although providing a single reference sequence of the human genome is an extraordinary achievement, further advances in sequencing technology are necessary so that large amounts of DNA can be sequenced and compared with other genomes quickly and cheaply. Comparing differences among long stretches of DNA - one million bases or more - taken from many individuals should yield an enormous amount of information about the role of inheritance in disease susceptibility, response to environmental influences and even evolution.

What does the sequence mean for future research?

The HGP's successful sequencing of the human genome has provided scientists with a virtual blueprint of the human being. However, this accomplishment should be viewed not as an end in itself, but rather as a starting point for even more exciting research. Armed with the human genome sequence, researchers are now trying to unravel some of biology's most complicated processes: how a baby develops from a single cell, how genes coordinate the functions of tissues and organs, how disease predisposition occurs and how the human brain works.

DNA sequence information derived by the HGP laboratories is freely accessible to scientists through GenBank [ncbi.nih.gov], a database run by the National Institutes of Health and the National Library of Medicine's National Center for Biotechnology Information [ncbi.nih.gov].

Last Reviewed: December 27, 2011