
Archived Page

This page has been archived and is provided for historical reference purposes only. The content and links are no longer maintained and may now be outdated.

NHGRI-DOE Guidance on Human Subjects Issues in Large-Scale DNA Sequencing

Policy Updates

August 17, 1996


The Human Genome Project (HGP) is now entering into large-scale DNA sequencing. To meet its complete sequencing goal, it will be necessary to recruit volunteers willing to contribute their DNA for this purpose. The guidance provided in this document is intended to address ethical issues that must be considered in designing strategies for recruitment and protection of DNA donors for large-scale sequencing.

Nothing in this document should be construed to differ from, or substitute for, the policies described in the Federal Regulations for the Protection of Human Subjects [45CFR46 (NIH) and 10CFR745 (DOE)]. Rather, it is intended to supplement those policies by focusing on the particular issues raised by large-scale human DNA sequencing. This statement addresses six topics: (1) benefits and risks of genomic DNA sequencing; (2) privacy and confidentiality; (3) recruitment of DNA donors as sources for library construction; (4) informed consent; (5) IRB approval; and (6) use of existing libraries.

The guidance provided in this statement is intended to afford maximum protection to DNA donors and is based on the belief that protection can best be achieved by a combination of approaches including:

  • ensuring that the initial version of the complete human DNA sequence is derived from multiple donors.
  • providing donors with the opportunity to make an informed decision about whether to contribute their DNA to this project.
  • taking effective steps to ensure the privacy and confidentiality of donors.

Benefits and Risks of Genomic DNA Sequencing

The HGP offers great promise for the improvement of human health. As a consequence of the HGP, there will be a more thorough understanding of the genetic basis of human biology and of many diseases. This, in turn, will lead to better therapies and, perhaps more importantly, prevention strategies for many of those diseases. Similarly, as the technology developed by the HGP is applied to understanding the biology of other organisms, many other human activities will be affected including agriculture, environmental management, and biologically-based industrial processes.

While the HGP offers great promise to humanity, there will be no direct benefit, in either clinical or financial terms, to any of the individuals who choose to donate DNA for large-scale sequencing. Rather, the motivation for donation is likely to be an altruistic willingness to contribute to this historic research effort.

However, individuals who donate DNA to this effort may face certain risks. Information derived from the donors will become available in public databases. Such information may reveal, for example, DNA sequence-based information about disease susceptibility. If the donor becomes aware of such information, it could lead to emotional distress on her/his part. If such health-related information becomes known to others, discrimination against the donor (e.g. in insurance or in employment) could result. Unwanted notoriety is another potential risk to donors. Therefore, those engaged in large-scale sequencing must be sensitive to the unique features of this type of research and ensure that both the protections normally afforded research subjects and the special issues associated with human genomic DNA sequencing are thoroughly addressed.

While some risks to donors can already be identified, the probability of adverse events materializing appears to be low. However, the risks of harm to individuals will increase if confidentiality is not maintained and/or the number of donors is limited to a very few individuals. Either, or both, of these situations would increase the possibility of a donor's identity being revealed without his/her knowledge or permission.

A final issue to consider is characterized in a statement taken from the OPRR Guidebook (1), which points out that "some areas [of genetic research] present issues for which no clear guidance can be given at this point, either because enough is not known about the risks presented by the research, or because no consensus on the appropriate resolution of the problem exists." It is anticipated that the DNA sequence information produced by the Human Genome Project will be used in the future for types of research which cannot now be predicted and the risks of which cannot be assessed or disclosed.

Privacy & Confidentiality

In general, one of the most effective ways of protecting volunteers from the unexpected, unwelcome or unauthorized use of information about them is to ensure that there are no opportunities for linking an individual donor with information about him/her that is revealed by the research. By not collecting information that links the identity of a research subject to any biological material or records developed in the course of the research, or by subsequently removing all identifiers ("anonymizing" the samples), the possibility of risk to the subject stemming from the results of the research is greatly reduced. Large-scale DNA sequence determination represents an exception because each person's DNA sequence is unique and ultimately, there is enough information in any individual's DNA sequence to absolutely identify her/him. However, the technology that would allow the unambiguous identification of an individual from his/her DNA sequence is not yet mature. Thus, for the foreseeable future, establishing effective confidentiality, rather than relying on anonymity, will be a very useful approach to protecting donors.

Investigators should introduce as many disconnects between the identity of donors and the publicly available information and materials as possible. There should not be any way for anyone to establish that a specific DNA sequence came from a particular individual, other than re-sampling an individual's DNA and comparing it to the sequence information in the public database. In particular, no phenotypic or demographic information about donors should be linked to the DNA to be sequenced. (2) For the purposes of the HGP such information will rarely be useful, and recording such information could result in possible misuse and compromise donor confidentiality.

Confidentiality should be "two way." Not only should others be unable to link a DNA sequence to a particular individual, but no individual who donates DNA should be able to confirm directly that a particular DNA sequence was obtained from their DNA sample. (3) This degree of confidentiality will preclude the possibility of re-contacting DNA donors, providing another degree of protection for them. It should be clear to both investigators and to donors that the contact involved in obtaining the initial specimen will be the only contact. (4)

Another approach for protecting all DNA donors is to reduce the incentive for wanting to know the identities of particular donors. If the initial human sequence is a "mosaic" or "patchwork" of sequenced regions derived from a number of different individuals, rather than that of a single individual, there would be considerably less interest in who the specific donors were. Although there may be scientific justification that each clone library used for sequencing should be derived from one person, there is no scientific reason that the entire initial human DNA sequence should be that of a single individual. As approximately 99.9 percent of the human DNA sequence is common between any two individuals, most of the fundamental biological information contained in the human DNA sequence is common to all people.

To increase the likelihood that the first human DNA sequence will be an amalgam of regions sequenced from different sources, a number of clone libraries must be made available. Although a number of large insert libraries have been made, most do not meet all of the standards set in this document; therefore, these libraries should be used as substrates for large-scale sequencing only under circumscribed conditions (see section 6, below). Starting immediately, new libraries will be developed that have the advantage of being constructed in accordance with the ethical principles discussed in this document; they may also confer some additional scientific benefit. Such libraries are critical for the long-range needs of the HGP.

Source/Recruitment of DNA Donors for Library Construction

Another implication of the fact that 99.9 percent of the human DNA sequence is shared by any two individuals is that the backgrounds of the individuals who donate DNA for the first human sequence will make no scientific difference in terms of the usefulness and applicability of the information that results from sequencing the human genome. At the same time, there will undoubtedly be some sensitivity about the choice of DNA sources. There are no scientific reasons why DNA donors should not be selected from diverse pools of potential donors. (5)

There are two additional issues that have arisen in considering donor selection. These warrant particular discussion:

  • It is recognized that women have historically been underrepresented in research, so it can be anticipated that concerns might arise if males (sperm DNA) were used exclusively as the source of DNA for large-scale sequencing. Although there would be no scientific basis for concern, because even in the case of a male source, half of the donor's DNA would have come from his mother and half from his father, nevertheless perceptions are not to be dismissed. While the choice of donors will not be dictated to investigators, it is expected that, because multiple libraries will be produced, a number of them will be made from female sources while others will be made from male sources.

  • Staff of laboratories involved in library construction and DNA sequencing may be eager to volunteer to be donors because of their interest and belief in the HGP. However, proximity to the research may create some special vulnerabilities for laboratory staff members. It is also possible that they will feel pressure to donate and there may be an increased likelihood that confidentiality would be breached. Finally, there is a potential that the choice of persons so closely involved in the research may be interpreted as elitist. For all of these reasons, it is recommended that donors should not be recruited from laboratory staff, including the principal investigator.

Informed Consent

Obtaining informed consent specifically for the purpose of donating DNA for large-scale sequencing raises some unique concerns. Because anonymity cannot be guaranteed and confidentiality protections are not absolute, the disclosure process to potential donors must clearly specify what the process of DNA donation involves, what may make it different from other types of research, and what the implications are of one's DNA sequence information being a public scientific resource.

Federal regulations (45CFR46 and 10CFR745) require the disclosure of a number of issues in any informed consent document. They include such issues as potential benefits of the research, potential risks to the donor, control and ownership of donated material, long-term retention of donated material for future use, and the procedures that will be followed. In addition, there are several other disclosures that are of special importance for donors of DNA for large-scale sequencing. These include:

  • The meaning of confidentiality and privacy of information in the context of large-scale DNA sequencing, and how these issues will be addressed.
  • The lack of opportunity for the donor to later withdraw the libraries made from his/her DNA or his/her DNA sequence information from public use.
  • The absence of opportunity for information of clinical relevance to be provided to the donor or her/his family.
  • The possibility of unforeseen risks.
  • The possible extension of risk to family members of the donor or to any group or community of interest (e.g., gender, race, ethnicity) to which a donor might belong.

Many academic human genetics units have considerable experience in dealing with research subjects and obtaining informed consent, while the laboratories that are likely to be involved in making the libraries for sequencing have, in general, much less experience of this type. Therefore, library makers are encouraged to establish a collaboration with one or more human genetics units, with the latter being responsible for recruiting donors, obtaining informed consent, obtaining the necessary biological samples, and providing a blinded sample to the library maker. Collaboration with tissue banks may be considered as long as these banks are collecting tissues in accordance with this guidance. The library maker should have no contact with the donor and no opportunity to obtain any information about the donor's identity.

IRB Approval

Effective immediately, projects to construct libraries for large-scale DNA sequencing must obtain Institutional Review Board (IRB) approval before work is initiated. IRBs should carefully consider the unique aspects of large-scale sequencing projects. Some of the informed consent provisions outlined may be somewhat at odds with the usual and customary disclosures found in most protocols involving human subjects and which IRBs usually consider. For example, research subjects usually are given the opportunity to withdraw from a research project if they change their minds about participating. In the case of donors for large-scale sequencing, it will not be possible to withdraw either the libraries made from their DNA or the DNA sequence information obtained using those libraries once the information is in the public domain. By the time a significant amount of DNA sequence data has been collected, the libraries, as well as individual clones from them, will have been widely distributed and the sequence information will have been deposited in and distributed from public databases. In addition, there will be no possibility of returning information of clinical relevance to the donor or his/her family.

Use of Existing Libraries for Large-Scale Sequencing

Many of the existing libraries (including those derived from anonymous donors) were not made in complete conformity with the principles elaborated above. The potential risks that may result from their use will be minimized by the rapid introduction of several new libraries constructed in accordance with this guidance, which NHGRI and DOE are taking steps to initiate. This will ensure that the existing libraries will only contribute small amounts to the first complete human DNA sequence. In the interim, existing libraries can continue to be used for large-scale sequencing, only if IRB approval and consent for "continued use" are obtained (6) and approval by the funding agency is granted.

It is important that in obtaining consent for continued use of existing libraries, no coercion of the DNA donor occur. It is therefore recommended that consideration be given to whether it is appropriate for the individual who previously recruited the donor to recontact him/her to obtain this consent. In some cases an IRB may determine that the recontact should be made by a third party to assure that the donors are fully informed and allowed to choose freely whether their DNA can continue to be used for this purpose.


This document is intended to provide guidance to investigators and IRBs who are involved in large-scale sequencing efforts. It is designed to alert them to special ethical concerns that may arise in such projects. In particular, it provides guidance for the use of existing and the construction of new DNA libraries. Adhering to this guidance will ensure that the initial version of the complete human sequence is derived from multiple, diverse donors; that donors will have the opportunity to make an informed decision about whether to contribute their DNA to this project; and that effective steps will be taken by investigators to ensure the privacy and confidentiality of donors. Investigators funded by NHGRI and DOE to develop new libraries for large-scale human DNA sequencing will be required to have their plans for the recruitment of DNA donors, including the informed consent documents, reviewed and approved by the funding agency before donors are recruited. Investigators involved in large-scale human sequencing will also be asked to observe those aspects of this guidance that pertain to them.

Approved By:

Francis S. Collins, M.D., Ph.D., Director, NHGRI, NIH
Aristides N. Patrinos, Ph.D., Associate Director, OHER, DOE

August 17, 1996


  1. Office of Protection from Research Risks, Protecting Human Research Subjects: Institutional Review Board Guidebook (OPRR: U.S. Government Printing Office, 1993)

  2. It is recognized that it will be trivially easy to determine the sex of the donor of the library, by assaying for the presence or absence of Y chromosome in the library.

  3. There are a number of approaches to preventing a DNA donor from knowing that his/her DNA was actually sequenced as part of the HGP. For example, each time a clone library is to be made, an appropriately diverse pool of between five and ten volunteers can be chosen in such a way that none of them knows the identity of anyone else in the pool. Samples for DNA preparation and for preparation of a cell line can be collected from all of the volunteers (who have been told that their specimen may or may not eventually be used for DNA sequencing) and one of those samples is randomly and blindly selected as the source actually used for library construction. In this way, not only will the identity of the individual whose DNA is chosen not be known to the investigators, but that individual will also not be sure that s/he is the actual source.

  4. Although recontacting donors should not be possible, investigators will potentially want to be able to resample a donor's genome. Thus, at the time the initial specimen is obtained, in addition to making a clone library representing the donor's genome, it should also be used to prepare an additional aliquot of high molecular weight DNA for storage and a permanent cell line. Either resource could then be used as a source of the donor's genome in case additional DNA were needed or comparison with the results of the analysis of the cloned DNA were desired.

  5. There has been discussion in the scientific community about the sex of DNA donors. A library prepared from a female donor will contain DNA from the X chromosome in an amount equivalent to the autosomes, but will completely lack Y chromosomal DNA. Conversely, a library prepared from a male donor will contain Y DNA, but both X and Y DNA will only be present at half the frequency of the DNA from the other chromosomes. Scientifically, then, there are both advantages and disadvantages inherent in the use of either a male or a female donor. The question of the sex of the donor also involves the question of the use of somatic or germ line DNA to make libraries. For making libraries, useful amounts of germ line DNA can only be obtained from a male source (i.e., from sperm); it is not possible to obtain enough ova from a female donor to isolate germ line DNA for this purpose. Opinion is divided in the scientific community about whether germ line or somatic DNA should be used for large-scale sequencing. Somatic DNA is known to be rearranged, relative to germ line DNA, in certain regions (e.g. the immunoglobulin genes) and the possibility has been raised that other developmentally-based rearrangements may occur, although no example of the latter has been offered. While some believe that the sequence product should not contain any rearrangements of this sort, others consider this potential advantage of germ line DNA to be relatively minor in comparison to the need to have the X chromosome fully represented in sequencing efforts and prefer the use of somatic DNA.

  6. Individuals whose DNA was used for library construction (with the exception of those created from deceased or anonymous individuals) should be fully informed about the risks and benefits described above, should freely choose whether they would like their DNA to continue to be used for this purpose, and their decision should be documented.
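
The blinded selection described in footnote 3 can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name, the use of opaque sample codes in place of donor identities, and the choice of Python's `secrets` module as the source of randomness are all hypothetical and are not part of the guidance.

```python
import secrets


def select_blinded_source(sample_codes):
    """Randomly pick one coded sample as the library source.

    Assumption: each volunteer's specimen is labeled only with an opaque
    code, so the person running the selection never sees donor identities.
    """
    codes = list(sample_codes)
    # Footnote 3 describes a pool of between five and ten volunteers.
    if not 5 <= len(codes) <= 10:
        raise ValueError("pool should contain between five and ten samples")
    if len(set(codes)) != len(codes):
        raise ValueError("sample codes must be unique")
    # Every sample has the same chance of being chosen, so no volunteer
    # can be sure that his or her specimen was the one actually used.
    return secrets.choice(codes)
```

In such a scheme, only the chosen code would be passed on to the library maker, and any record linking codes to individuals would not be retained.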

April 27, 1998

Update of NHGRI Policy for the Use of Human Subjects in Large-Scale Sequencing

The NHGRI Policy for the Use of Human Subjects in Large-Scale Sequencing was initially developed two years ago. The policy was developed to ensure that the Federal regulations for the protection of human subjects in research (45CFR46) were being appropriately applied to the development of the first complete human DNA sequence as part of the Human Genome Project. Specifically, the NHGRI policy was intended to ensure that the donors had given appropriate informed consent for the use of their DNA in human DNA sequencing and that they remain, to the extent feasible, anonymous (recognizing that, ultimately, anonymity may be impossible as each individual's DNA sequence is a unique identifier). The purpose of this document is to describe updates and refinements to the policy that have been made since the policy was initially issued.

In Summary:

  1. Libraries that are constructed under the NHGRI Guidelines are categorized as "approved." The protocols under which these libraries are constructed, and the informed consent document, must be approved by the institutional IRB. The donors must give informed consent and all links to their identity must be destroyed. The donors should be informed that, ultimately, anonymity might not be possible because each individual's DNA sequence constitutes a unique identifier. NHGRI staff must review the IRB-approved protocol.

    The large-scale sequencing groups that use approved libraries must notify their local IRBs that they are using these libraries. NHGRI must receive documentation that approval has been given by the institution, or that their use has been declared to be exempt.

  2. Some libraries had been made prior to the development of the NHGRI Policy. In some cases, the donors for these libraries were known to the investigators, so they are not anonymous (although their identity is being kept confidential by the investigator and the institution). In some of those cases, the donors subsequently signed informed consent allowing the libraries to continue to be used, and the institutional IRB gave approval for the continued use. These libraries are categorized as "continued use" libraries.

    • Groups using "continued use" libraries for large-scale sequencing must receive approval from the local IRB, and NHGRI must receive documentation that the approval has been given.

    • "Continued use" libraries may be used for large-scale sequencing until June 30, 1999 if, at that time, a sufficient number of approved libraries is available to support the large-scale sequencing effort in the U.S. NHGRI will make that determination during the Spring of 1999. After that time, use of the "continued use" libraries must be terminated.

    • Termination of the use of a library means that, as of the termination date, no clones from that library may enter the "sequencing pipeline," i.e., be used to construct shotgun subclone libraries.

  3. Human genomic DNA libraries that are not "approved" or approved for "continued use" are categorized as "unapproved" libraries. The donors of these libraries did not give appropriate informed consent for their DNA to be used in large-scale sequencing and, in some cases, it may be possible to trace the donor's identity.

    • After June 20, 1998 these libraries cannot be used in large-scale sequencing projects for anything more than a minimal amount of sequencing, specifically no more than a total of 1 Mb of sequence over the total project period of the grants, i.e., 3 or 5 years.

    • Termination of the use of a library means that, as of the termination date, no clones from that library may enter the "sequencing pipeline," i.e., be used to construct shotgun subclone libraries.

  4. Each NHGRI grantee engaged in large-scale human DNA sequencing must provide a transition plan that details his plans to terminate use of unapproved or continued use libraries.

  5. Exceptions to this policy will be considered on a case-by-case basis. An important criterion in considering any exception will be the degree of risk that use of the library would pose to the donor.

Revised Criteria and Process for Obtaining NHGRI Approval for:

  1. Continued Use of Existing DNA Libraries
  2. Construction and Use of New Libraries for Large-Scale Human DNA Sequencing

July 16, 1997


On August 17, 1996, the NHGRI-DOE "Guidance on Human Subjects Issues in Large-Scale Sequencing" was issued with the intention of advising investigators, Institutional Review Boards and others about those issues which are likely to arise in the construction and use of DNA libraries. Section 6 of the Guidance contains the following language:

In the interim, existing libraries can continue to be used for large-scale sequencing, only if IRB approval and consent for "continued use" are obtained and approval by the funding agency is granted.

On October 11, 1996, the NHGRI distributed a draft entitled "Criteria and Process for Obtaining NHGRI Approval for: (A) Continued use of existing DNA libraries and (B) Construction and use of new libraries for Large-Scale Human DNA Sequencing." The purpose of that document was to further clarify what was intended by the phrase "approval by the funding agency is granted." The NHGRI has now revised and updated that document.

1. Definitions

1.1 In the draft document, the term "In the interim" was not intended to define a specific date beyond which it would be impermissible to continue using existing libraries. The term was chosen to recognize that, while replacement of those libraries is desirable, no libraries preferable to the existing ones were available at that time. Subsequently, NHGRI supported the construction of new, more acceptable libraries, and the first of these are now close to being ready for distribution. Consequently, NHGRI has decided that the "interim period" is now coming to a close. Grantees engaged in large-scale DNA sequencing are expected to introduce these new libraries into their operations as soon as they have been shown to be capable of supporting large-scale sequencing. NHGRI continues to support the following concept, as originally expressed in the draft document: "As new libraries are developed according to the principles described in the Guidance, NHGRI expects that the new libraries will replace the existing libraries as rapidly as possible, consistent with the scientific strategy of each sequencing laboratory. Thus, NHGRI expects that existing libraries will have a continually diminishing role in the large-scale sequencing efforts supported by the agency. NHGRI intends to work with grantees to facilitate smooth and rapid transitions."

1.2 By "large-scale sequencing" we mean any NHGRI-supported human genomic sequencing grant where the amount of human DNA sequenced from any one library is at least one Mb of genomic sequence over the entire project period (competing segment).
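
The threshold in 1.2 applies to each library separately, not to the grant's total output. A minimal sketch of that reading follows; the function name and the dictionary of per-library base counts are hypothetical, and one Mb is taken to be 1,000,000 bases.

```python
MB = 1_000_000  # bases per megabase (assumed conversion)


def is_large_scale(bases_by_library):
    """Return True if any single library contributes at least 1 Mb.

    bases_by_library maps a library name to the number of bases of human
    genomic sequence obtained from it over the entire project period.
    """
    return any(bases >= MB for bases in bases_by_library.values())
```

Under this reading, a grant that sequences 0.6 Mb from each of two libraries would not meet the definition, while a grant that sequences exactly 1 Mb from one library would.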

2. Obtaining NHGRI Approval

Library makers and sequencers each have responsibilities with respect to the protection of human subjects. The responsibilities differ somewhat with respect to what the Guidance requires, as follows:

2.1 NHGRI-supported library makers who do not intend to withdraw an existing library from use.
A "request for continued use" must be submitted to NHGRI; this should include:

  • the steps that were taken to comply with the Guidance, including a description of the process that was used and a copy of the informed consent form that was used;
  • documented approval from the local Institutional Review Board.

2.2 Library makers who intend to construct a new library.
A plan must be submitted to NHGRI; this plan should include documented approval from the local Institutional Review Board and the plans for recruiting DNA donors, including a description of the process for obtaining informed consent and a copy of the informed consent form to be used.

2.3 DNA sequencers who intend to use an existing library.
Sequencers must inform NHGRI of their plan to use an existing library. The notice should describe plans for transition to the use of new libraries as soon as the latter become available, and should include documentation that the local institution has agreed to the use of a library (or libraries) that has been approved for continued use by the IRB of the institution where the library was generated.

2.4 DNA sequencers who intend to use a new library (which has been approved under the Guidance).
Sequencers must inform NHGRI of their plan to use a new library. The notice should include documentation that the local institution has agreed to the use of a library (or libraries) that has been approved for use by the IRB of the institution where the library was generated.

3. NHGRI Action

Once NHGRI has been notified, a decision will be made in as timely a manner as possible. It is expected that notification will be provided within two weeks of the receipt of plans. In the case of 2.1-2.3, NHGRI will inform the applicants of its approval. In the case of 2.4, NHGRI will acknowledge the applicant's notification.

Last Reviewed: March 9, 2012