A base pair consists of two complementary DNA nucleotide bases that pair together to form a "rung of the DNA ladder." DNA is made of two linked strands that wind around each other to resemble a twisted ladder, a shape known as a double helix. Each strand has a backbone made of alternating sugar (deoxyribose) and phosphate groups. Attached to each sugar is one of four bases: adenine (A), cytosine (C), guanine (G; pronounced GWA-neen) or thymine (T). The two strands are held together by hydrogen bonds between pairs of bases: adenine pairs with thymine, and cytosine pairs with guanine.
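The pairing rule described above can be written as a small lookup table. The following is a minimal sketch, not a standard library API: the names `PAIRS`, `is_base_pair`, and `strands_pair` are hypothetical, and the second strand is assumed to be written so that its bases line up position by position with the first.

```python
# Pairing rules from the text: A pairs with T, and C pairs with G.
# PAIRS, is_base_pair and strands_pair are illustrative names only.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_base_pair(base1: str, base2: str) -> bool:
    """Return True if the two bases form a complementary pair."""
    # Unknown letters map to None, which never equals a base letter.
    return PAIRS.get(base1.upper()) == base2.upper()


def strands_pair(strand1: str, strand2: str) -> bool:
    """Return True if every aligned position forms a valid base pair.

    Assumes the two strings are already aligned rung by rung.
    """
    if len(strand1) != len(strand2):
        return False
    return all(is_base_pair(a, b) for a, b in zip(strand1, strand2))


if __name__ == "__main__":
    print(is_base_pair("A", "T"))
    print(is_base_pair("A", "G"))
    print(strands_pair("GATTACA", "CTAATGT"))
```

Each key in the table maps to its only permitted partner, so a single mismatched rung is enough for `strands_pair` to return `False`.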
One copy of the human genome consists of approximately 3 billion base pairs of DNA, which are distributed across 23 chromosomes. Human chromosomes range in size from about 50 million to 300 million base pairs. Because the bases exist as pairs, and the identity of one base in a pair determines the other, scientists do not have to report both bases of the pair. This is why a DNA sequence is typically represented as a single string of letters. DNA sequencing involves determining the exact order of the base pairs across a DNA segment of interest or across an entire genome. A signature goal of the Human Genome Project was to generate the first high-quality sequence of the human genome. The effort succeeded in generating such a sequence for over 90% of the human genome, but it took nearly two more decades to sequence the remaining portions, which were heavily enriched for highly repetitive and difficult-to-sequence stretches of DNA.
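Because one strand determines the other, the partner strand can be computed from a single string of letters. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and the input contains only the letters A, C, G and T. It also relies on a convention not stated above, namely that the two strands run in opposite directions and each is conventionally written 5' to 3', so the partner is the reverse complement rather than the plain complement.

```python
# Translation table that swaps each base for its pairing partner.
# complement and reverse_complement are illustrative names only.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the partner base for each position, in the same order."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the partner strand as it would conventionally be written.

    Assumption: both strands are reported 5' to 3', so the complement
    is reversed before it is returned.
    """
    return complement(seq)[::-1]


if __name__ == "__main__":
    strand = "GATTACA"
    print(complement(strand))          # CTAATGT
    print(reverse_complement(strand))  # TGTAATC
```

Applying `reverse_complement` twice returns the original string, which reflects the point made above: reporting one strand loses no information about the other.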
Chief, Office of Communications
National Human Genome Research Institute, NIH