Data Sharing Policies and Expectations
This webpage and the associated FAQs describe the various expectations for data sharing that are specific to NHGRI-supported studies. For general NIH data sharing policy information, please visit NIH's Scientific Data Sharing website.
Data Management and Sharing
Note: The NIH Data Management and Sharing (DMS) Policy (NOT-OD-21-013) goes into effect on January 25, 2023.
Broad data sharing promotes maximum public benefit from federally funded research, as well as rigor and reproducibility. For studies involving humans, responsible data sharing is important for maximizing the contributions of research participants and promoting trust. NHGRI supports the broadest appropriate data sharing with timely data release through widely accessible data repositories. These repositories may be open access (unrestricted) or, if more appropriate, controlled access. NHGRI is also committed to ensuring that publicly shared datasets are comprehensive and Findable, Accessible, Interoperable and Reusable (FAIR).
For information on NIH’s data management and sharing requirements, see their Data Management and Sharing webpage. For more on NHGRI’s expectations, see below.
Where to Submit Data
Note: More detailed guidance on data repositories for sharing data under the upcoming NIH DMS Policy is forthcoming.
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) is a secure, cloud-based environment where researchers can store, share and analyze key unrestricted and controlled-access datasets and associated phenotypic data or metadata, particularly those generated with NHGRI funding or support (NOT-HG-19-024).
Upon implementation of the DMS Policy, controlled-access data generated with NHGRI funds should generally be shared through AnVIL. Large, multi-site (consortium) programs that generate open-access data resources for the broader scientific community should share via the mechanism specified in the Funding Opportunity Announcement, for example, through the project’s coordinating center.
How to Register Controlled-Access Studies
Note: After January 25, 2023, NHGRI's Genomic Data Sharing Policy (GDSP) templates will be replaced by a new document for capturing the information needed to register a controlled-access study. Signing official approval will no longer be needed. For application receipt dates on or after January 25, 2023, requests for an Alternative Data Sharing Plan should be documented in the DMS Plan.
Follow the process outlined in How to Register and Submit Your Study in dbGaP (Steps 1 – 6) to register your study in dbGaP.
Note: Study registration in dbGaP is required for large-scale human genomic studies submitting data to AnVIL and studies with an Alternative Data Sharing Plan.
For Step 2: Please complete the relevant template or send the basic study information needed for study registration to the NHGRI Genomic Program Administrator (GPA) (Jennifer.firstname.lastname@example.org).
Investigators seeking to submit non-NIH funded data to an NIH-designated data repository (e.g., AnVIL or dbGaP) should follow these instructions.
NHGRI follows the NIH’s expectation for submission and release of scientific data, with the following exception: for genomic data, NHGRI expects non-human genomic data that are subject to the NIH GDS Policy to be submitted and released on the same timeline as human genomic data.
- Level 0: Raw data generated directly from the instrument platform
- Level 1: Initial sequence reads, the most fundamental form of the data after the basic translation of raw input
- Level 2: Data after an initial round of analysis or computation to clean the data and assess basic quality measures
- Level 3: Analysis to identify genetic variants, gene expression patterns, or other features of the data set
- Level 4: Final analysis that relates the genomic data to phenotype or other biological states
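The five data levels listed above form an ordered scale, from least to most processed. As a minimal sketch, a submission pipeline might tag each file with its level using an ordered enumeration. The names `DataLevel` and `label` are hypothetical and are not part of any NIH, dbGaP or AnVIL tooling.

```python
from enum import IntEnum


class DataLevel(IntEnum):
    """Genomic data levels, ordered from least to most processed."""

    RAW = 0             # raw data generated directly from the instrument platform
    SEQUENCE_READS = 1  # initial sequence reads after basic translation of raw input
    CLEANED = 2         # data after initial cleaning and basic quality assessment
    FEATURES = 3        # identified variants, expression patterns or other features
    FINAL_ANALYSIS = 4  # analysis relating genomic data to phenotype or other states


def label(level: DataLevel) -> str:
    """Build a short human-readable label for a data level."""
    return f"Level {int(level)}: {level.name.replace('_', ' ').lower()}"


print(label(DataLevel.FEATURES))
# Because IntEnum members compare as integers, levels can be ordered directly.
print(DataLevel.RAW < DataLevel.FINAL_ANALYSIS)
```

Using an integer-backed enumeration keeps the ordering of the levels explicit, so files can be sorted or filtered by how far they have been processed.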
Metadata and Phenotypic Data Sharing Expectations
Note: After the effective date of the NIH DMS Policy, investigators will outline plans for sharing of metadata, phenotypic data and other descriptive information in the DMS Plan rather than in the Resource Sharing Plan.
Per NOT-HG-21-022, NHGRI-funded and supported researchers are expected to:
- Share the metadata and phenotypic data associated with the study.
- Use standardized data collection protocols and survey instruments for capturing data, as appropriate.
- Use standardized notation for metadata (e.g., controlled vocabularies or ontologies) to enable the harmonization of datasets for secondary research analyses.
Investigators should outline plans for comprehensive sharing of metadata, phenotypic data and other descriptive information (e.g., protocols or methodologies used) in the Resource Sharing Plan of the grant application.
NHGRI strongly encourages the use of existing data standards and ontologies that are generally endorsed by the relevant research community, although it does not require the use of any particular one. Investigators should use data standard(s) and ontologies that facilitate comparison across similar studies within their research field.
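To illustrate what standardized notation for metadata can look like in practice, the following is a minimal sketch that checks a metadata record against a controlled vocabulary. The vocabulary, the field names and the function `find_nonstandard_values` are all hypothetical; a real study would draw its terms from a standard or ontology endorsed by its research community.

```python
# Hypothetical controlled vocabularies, for illustration only.
CONTROLLED_VOCAB = {
    "sex": {"female", "male", "unknown"},
    "sample_type": {"blood", "saliva", "tissue"},
}


def find_nonstandard_values(record: dict) -> dict:
    """Return {field: value} for every value outside its controlled vocabulary.

    A field that is missing from the record is reported with the value None.
    """
    problems = {}
    for field, allowed in CONTROLLED_VOCAB.items():
        value = record.get(field)
        if value not in allowed:
            problems[field] = value
    return problems


record = {"sex": "F", "sample_type": "blood"}
print(find_nonstandard_values(record))  # flags "F", which is not a vocabulary term
```

Catching free-text values such as "F" before submission makes it easier to harmonize datasets later for secondary research analyses.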
For ideas of where to find a standard that aligns with your research domain, see the FAQ “Where should I start?”
Genomic Data Sharing
Note: After the effective date of the NIH DMS Policy, investigators will indicate an Alternative Data Sharing Plan or a request for an exception to the NHGRI expectation for explicit consent in their DMS Plan rather than in the Resource Sharing Plan.
For information on NIH’s genomic data sharing requirements, see the Genomic Data Sharing Policy webpage. For more on NHGRI’s expectations, see below.
Applicability of the Genomic Data Sharing Policy
The NIH Genomic Data Sharing (GDS) Policy (NOT-OD-14-124) and NHGRI’s implementation of the policy apply particularly to single nucleotide polymorphism (SNP) array data, genome sequence data, transcriptomic data, epigenomic data or other molecular data produced by array-based or high-throughput sequencing technologies.
NHGRI's Expectation Under the Policy
NHGRI values and encourages the sharing of data from smaller projects that do not meet the definition of “large-scale” according to the NIH guidance regarding scope of the GDS Policy. Investigators should consult with appropriate NHGRI program directors as early as possible to determine whether the GDS Policy applies to their research study. See the Notice of Plans for NHGRI Implementation of NIH Genomic Data Sharing Policy (NOT-HG-20-011 and NOT-HG-15-038) for more information about NHGRI’s expectations for genomic data sharing.
Informed Consent Requirements
Per the NIH GDS Policy, for studies that started after January 25, 2015 (the NIH GDS Policy effective date), informed consent documents for prospective data collection should state what data types will be shared (e.g., genomic, phenotype, health information), for what purposes (e.g., general research use, disease-specific research use), and whether sharing will occur through open (unrestricted) or controlled-access databases.
The information that NIH expects to be conveyed in informed consent documents is defined in the NIH Guidance on Consent for Future Research Use and Broad Sharing of Human Genomic and Phenotypic Data Subject to the NIH Genomic Data Sharing Policy.
For more in-depth discussion of principles and best practices for drafting informed consent documents for genomics research, see the NHGRI Informed Consent Resource.
NHGRI's Expectations Under the Policy
- NHGRI strongly encourages studies that propose to derive genomic data from human specimens and cell lines to obtain participant consent either for general research use through controlled access or for unrestricted access. Similarly, consent language should avoid both restrictions on the types of users who may access the data and restrictions that add additional requirements to the access request process. NHGRI acknowledges that this will not always be possible or appropriate. In addition, individual participants who do not consent to future research use or broad sharing of their data (i.e., submission of their data to a publicly accessible data repository) may still participate in the primary study if consistent with study design.
- As of January 25, 2021, NHGRI expects that all human data generated by NHGRI-funded or supported research will be derived from biospecimens or cell lines for which explicit consent for future research use and broad data sharing can be documented. This NHGRI expectation goes beyond those of the NIH GDS Policy; the NIH GDS Policy does not require explicit consent for future use and broad data sharing when specimens or cell lines were created or collected before January 25, 2015, but NHGRI’s expectation contains no such clause. Research that proposes to use specimens and cell lines that lack explicit consent for future research use and broad data sharing should be accompanied by a request for an exception that describes the scientific reason(s) for using the specified data sources.
The need for an exception should be documented in the Resource Sharing Plan of the grant application. Requests for exceptions should be submitted via Part VI (“Request for an Exception for Samples Lacking Explicit Consent for Future Research Use and Broad Data Sharing”) of the NHGRI Genomic Data Sharing Plan (GDSP) template. (See the “How to Register Controlled-Access Studies” section.)
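The difference between the NIH GDS Policy and the NHGRI expectation described above can be sketched as two simple checks. This is an illustrative reading of the text on this page, not an official decision tool. The names `Specimen`, `nih_gds_requires_explicit_consent` and `nhgri_exception_needed` are hypothetical, and treating a specimen collected exactly on January 25, 2015 as covered by the GDS requirement is an assumption.

```python
from dataclasses import dataclass
from datetime import date

# Effective date of the NIH GDS Policy, as stated on this page.
NIH_GDS_EFFECTIVE = date(2015, 1, 25)


@dataclass
class Specimen:
    collected_on: date          # when the specimen or cell line was created or collected
    has_explicit_consent: bool  # documented consent for future use and broad sharing


def nih_gds_requires_explicit_consent(specimen: Specimen) -> bool:
    """NIH GDS Policy: no explicit-consent requirement for specimens collected
    before the effective date. Treating the effective date itself as covered
    is an assumption made for this sketch."""
    return specimen.collected_on >= NIH_GDS_EFFECTIVE


def nhgri_exception_needed(specimen: Specimen) -> bool:
    """NHGRI expectation: no collection-date clause, so any specimen lacking
    documented explicit consent calls for an exception request."""
    return not specimen.has_explicit_consent


legacy = Specimen(collected_on=date(2010, 6, 1), has_explicit_consent=False)
print(nih_gds_requires_explicit_consent(legacy))  # collected before the GDS effective date
print(nhgri_exception_needed(legacy))             # still lacks explicit consent
```

For a legacy specimen like the one in the example, the two checks disagree, which is the case in which an exception request to NHGRI would be expected.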
For more information on the explicit consent requirement, visit our FAQ.
Alternatives to the GDS Expectations
When consistent with program priorities, NHGRI may accept well-justified data sharing plans that:
- Indicate that genomic data cannot be deposited in an NIH-designated data repository.
- Propose to share genomic data via a non-NIH-designated data repository.
A detailed explanation of the alternative mechanism for data sharing should be documented in the Resource Sharing Plan of the grant application and via the Alternative Data Sharing Plan Template. (See the “How to Register Controlled-Access Studies” section.)
NOT-HG-21-022: Notice Announcing the National Human Genome Research Institute’s Expectation for Sharing Quality Metadata and Phenotypic Data
NOT-HG-15-038: Notice of Plans for NHGRI Implementation of NIH Genomic Data Sharing Policy
NOT-HG-20-011: NHGRI Implementation of the NIH Genomic Data Sharing Policy
Resources for Intramural Staff
NHGRI intramural staff should refer to specific instructions for NHGRI forms and submitting plans on the NHGRI Intranet's Genomic Data Sharing Policy Resources page (requires NIH login).
Last updated: November 2, 2022