What Information and Data Are Submitted?
All final datasets (human or non-human, including microbial data) generated through large-scale genomic projects, not just those datasets generated to support a publication, should be submitted to appropriate data repositories or made available through NHGRI-approved alternative data sharing plans. NHGRI finds value in and encourages the sharing of data from smaller projects that do not meet the definition of ‘large-scale’ according to the NIH guidance regarding the scope of the GDS Policy. Investigators should consult with appropriate NHGRI Program Directors as early as possible to determine whether the GDS Policy applies to their research study.
For more information on the scope of the NIH GDS Policy, see Section B of the NIH Genomic Data Sharing FAQs. For more information on how NHGRI’s expectations differ from the NIH expectations, see the NHGRI-Specific GDS Policy FAQs.
All metadata and descriptive information (e.g., protocols or methodologies used) needed to support future use of the data should be submitted. As much de-identified phenotype data as is practicable should be submitted. In this context, phenotype data refers to clinical data, environmental data, demographic variables, and any non-genotype data. When appropriate, relevant phenotype data from non-human studies also should be shared through open (unrestricted access) community resource data repositories.
Large resource projects (e.g., 1000 Genomes) should share their raw data (e.g., reads), intermediate data (e.g., assemblies), and processed data (e.g., variant calls, genotypes, haplotypes). When possible, investigators should use standard formats and vocabularies/ontologies to describe data elements (e.g., sequence data, variants, or phenotypes).
Data Sharing Plans
Resources regarding the expected elements of data sharing plans are provided on the NIH GDS Policy website, including information on how these plans are considered during peer review. After peer review, NHGRI will assess the potential value of the dataset for use in secondary analyses to confirm findings, explore different research questions, develop or refine analytic methodologies or programs, etc. In addition, funds and other resources requested for data deposition, management, or access will be considered.
For studies involving human data, NHGRI also will consider Institutional Review Board (IRB) assessments of informed consent processes and consent documents as noted in the NIH GDS Policy. An IRB should be consulted during the process of developing a Data Sharing Plan for studies generating data from human subjects. Participant protection issues for the proposed study population (e.g., particular privacy concerns or a potential for group harm) or related to the scientific design (e.g., isolated geographic population or small family studies), as evaluated by an IRB and consistent with program priorities, will also inform data sharing plan review.
The Institutional Certification asserts that plans for the submission of human genomic data to the NIH meet the expectations of the NIH GDS Policy, and describes the Data Use Limitations (DULs) associated with the data set. DULs are developed by the submitting institution and are based on the terms of the informed consent of the study participants from whom the genomic data are being generated, or otherwise stipulated by the submitting institution.
As specified in the NIH GDS Policy, data submitting institutions (including the NHGRI Intramural Research Program) should submit to the pertinent NHGRI Program Director or NHGRI Genomic Program Administrator (GPA), as appropriate, an Institutional Certification document signed by an appropriate Institutional Signing Official, for studies that require this document.
For information about exceptions and alternatives to the NHGRI genomic data sharing expectations, see Exceptions and Alternatives.
Genomic Summary Results Update
On November 1, 2019, NIH updated the management of Genomic Summary Results (GSR) to allow unrestricted access to GSR from most studies deposited in NIH-designated data repositories.
Genomic Data Sharing Plan Forms
NHGRI requires that investigators complete an NHGRI Genomic Data Sharing Plan (GDSP) form as part of their Just-in-Time information. Investigators should work with their Program Director and the NHGRI GPA to finalize and approve their Just-in-Time information.
Process for Submitting and Releasing Data
Clear milestones for the timing of data deposition should be established for each project and included in the Data Sharing Plan to provide a timeline by which to assess progress toward meeting data submission expectations. Milestones should adhere to standard data release timelines outlined in the NHGRI Genomic Data Sharing (GDS) Policy: Data Standards and the NHGRI Guidance for Data Submission and Data Release instructions below, and should be discussed with the relevant Program Director prior to the start of research projects. Large resource projects may develop project-specific timelines for data release, in conjunction with program officers or NHGRI intramural leadership, that exceed the minimum expectations specified in the NIH GDS Policy Supplemental Information and the NHGRI Guidance for Data Submission and Data Release instructions below.
Unless otherwise specified by funding opportunity announcements, analyses by submitting investigators that are conducted subsequent to the initial data submission, final data sets, or any data updates should be submitted for release concurrent with the first publication analyzing the dataset.
Data sharing progress reports will be expected, consistent with trans-NIH processes, as they are implemented, or through other NHGRI consortia reporting mechanisms, as applicable. Program directors will monitor progress against the timelines established through the data sharing plans.
NHGRI Guidance for Data Submission and Data Release
Investigators should note the following NHGRI data release expectation for non-human genomic data that differs from the NIH expectation. Data sharing plans for NHGRI-funded or -supported projects to generate non-human genomic data proposed after January 25, 2016 should include pre-publication timelines for data submission and release consistent with NIH GDS Policy expectations for human genomic data (including a possible holding period before data release not to exceed six months).
For more detailed information on NHGRI’s expectations for data sharing, see the NHGRI Genomic Data Sharing (GDS) Policy: Data Standards.
Last updated: August 27, 2019