Data Release and Access Principles and Policy

The human genome, the common heritage of all humanity, is arguably the most valuable dataset the biomedical research community has ever known. It holds long-sought secrets of human development, physiology, and medicine.

The highest priority of the International Human Genome Sequencing Consortium is ensuring that sequencing data from the human genome is available to the world's scientists rapidly, freely and without restriction.

Since the sequencing phase of the Human Genome Project (HGP) began five years ago, all of the data generated by participants has been deposited in publicly available databases every 24 hours.

Translating the text of the human genome into practical applications that will alleviate suffering is one of the greatest challenges facing humankind. This mission will require the work of tens of thousands of scientists throughout the world. No scientist wanting to advance this cause should be denied the opportunity to do so for lack of access to raw genomic data. Delaying the release of either unfinished or finished genomic DNA sequence data serves no scientific or societal purpose.

Early Results

Free and unfettered access to raw sequence data has sparked an explosion of scientific discovery in both academia and industry - even before the sequence's final assembly and completion.

  • Identification of Disease Genes: Already, public access to genome data has enabled researchers to identify at least 30 novel disease genes, including those for susceptibility to some forms of breast cancer, epilepsy, muscular dystrophy and deafness. Hundreds more are certain to follow in the next few years.

  • Identification of Drug Targets: Access to the entire complement of genes and proteins embedded in the human genome will greatly expand the search for suitable drug targets. A systematic search of the human genome database has already identified many regions with DNA sequences that closely resemble known disease genes. These regions are likely to represent promising new drug targets.

Policy Statements Relevant to the Release of and Access to Genomic Sequence Data


Determination of Exceptional Circumstances (DEC) under 35 USC 202(a)(ii) and 37 CFR 401.3(a)(2) and (e) for the NIH Full-Length cDNA Initiative Contract and its Subcontracts, Harold Varmus, M.D., NIH Director (16 November 1999)

  • Full text available upon request

  • Excerpt: "The success of this initiative depends on the timely availability of all the clones and sequences generated by this initiative to ensure a publicly accessible gold standard set of reagents for biomedical research. The cDNA libraries, clones, and sequences generated as part of the NIH Full-Length cDNA Initiative will most effectively contribute to a resource for the research community if they are made publicly available without restriction and in a timely manner. The sharing of materials and data in a timely manner has been an essential element in the rapid progress that has been made in biomedical research."

NIH Principles and Guidelines for Sharing of Biomedical Research Resources (December 1999)

  • Excerpt: "Progress in science depends upon prompt access to the unique research resources that arise from biomedical research laboratories throughout government, academia, and industry. Ideally, these new resources flow to others who advance science by conducting further research. ... The goal is widespread, timely distribution of tools for further discovery. When research tools are used only within one or a small number of institutions, there is a great risk that fruitful avenues of research will be neglected."

International Strategy Meetings on Human Genome Sequencing:

National Research Council

Mapping and Sequencing the Human Genome (1988)

  • Excerpt 1 (p. 7): "Considerable data will be generated from the mapping and sequencing project. Unless this information is effectively collected, stored, analyzed, and provided in an accessible form to the general research community worldwide, it will be of little value. ... Because access to all sequences and materials generated by these publicly-funded projects should and even must be made freely available, two different types of centralized facilities will be needed: (1) information centers to collect and distribute mapping and sequencing data, and (2) centers to collect and distribute materials such as DNA clones and human cell lines."

  • Excerpt 2 (p. 91): "Absolutely essential to the success of the project will be cooperation between laboratories and centers - within the United States and internationally - and the ready availability of data and materials to all participants. This committee believes that human genome sequences should be a public trust and therefore should not be subject to copyright."

Office of Technology Assessment

Mapping Our Genes - Genome Projects: How Big? How Fast? (April 1988)

  • Excerpt: "... genome projects raise no new questions of patent or copyright law. Genome projects would be subject to the same statutes and executive orders as other scientific efforts. There is a clear role for congressional oversight, however, in ensuring that data are shared promptly and fully. ... If private corporations do form to develop map and sequence data and research materials, they will operate at private expense. If they are successful, scientists will have new information, services, and materials available for a price. ... Corporate efforts need not entail restricted access to information. ... The essential point is not whether a grantee or a contractor is a university or corporation, but whether the research results will be widely shared."

Human Genome Organization (HUGO)


