
Human Genome Reference Sequence

updated: July 13, 2024


A human genome reference sequence is an accepted representation of the human genome sequence that is used by researchers as a standard for comparison to DNA sequences generated in their studies. The scientists responsible for assembling and updating such reference sequences aim to provide the highest-quality, best possible consensus representations of the sequence and structural diversity found in the human genome among populations. The genome reference sequence provides a general framework and is not the DNA sequence of a single person.


At the completion of the Human Genome Project in 2003, scientists achieved a major milestone: a DNA sequence that covered 99 percent of the human genome's gene-containing regions and was 99.99 percent accurate. This set of data, as intended, was a “reference” for the human genome, representing not one single person across the whole genome but a collection of different, albeit anonymous, people. It was an example of a human genome on which scientists could base research studies and against which they could compare other human genomes. Since then, researchers have worked to fill the gaps and correct the inaccuracies in the human genome sequence. They have also worked tremendously hard to represent places in the genome where humans differ. The Human Genome Reference Sequence is not an example of one human but represents many different varieties of human genomes.

Kris A. Wetterstrand, M.S.

Program Operations Lead

Division of Extramural Operations