Last updated: October 07, 2014
The Natural Evolution Of Genomic Data Sharing
September 3, 2014
Rapid and broad data sharing has been a hallmark of genomics since the early days of the Human Genome Project (HGP). Today, it is well appreciated in genomics that the work of individual investigators and large collaborative efforts alike benefits from access to data resources such as ENCODE, 1000 Genomes, and The Cancer Genome Atlas (TCGA). Furthermore, the cumulative benefit realized through the culture of genomic data sharing transcends individual projects and has been essential to accelerating genomics research across the board.
On August 26, NIH released its new Genomic Data Sharing (GDS) Policy, updating a previous policy that focused on genome-wide association study data. The new policy promotes sharing a broader array of large-scale human and non-human genomic data generated for appropriate research purposes when those data are generated through NIH-supported or NIH-conducted research. This new policy represents another step the agency is taking to encourage the culture of data sharing and to extend such data-sharing practices beyond large research projects (e.g., HGP and 1000 Genomes).
The GDS Policy builds upon an existing data sharing framework, ensuring responsible and respectful research participant protections and promoting appropriate access to genomic and associated data. In a recent Nature Genetics article entitled "Data Use under the NIH GWAS Data Sharing Policy and Future Directions", NIH leaders detail the experience to date of the two-tiered access structure established by the Genome-Wide Association Studies (GWAS) Policy and implemented through the database of Genotypes and Phenotypes (dbGaP). Under the GWAS Policy, more than 2,200 investigators from 41 different countries have received access to dbGaP data from 304 studies, producing more than 900 publications and significant advances. The findings highlight that access to such data provides not only an opportunity to accelerate research by combining large and information-rich datasets (or simply by enabling additional research questions to be addressed), but also the potential to maximize the public benefit achieved through this increased capacity.
Of course, sharing data generated about human research participants must be done in a manner that appropriately protects participant interests and respects the participant agreements for sharing data and health information. Like the GWAS Policy before it, the GDS Policy aims to maximize scientific advances and potential public benefit in a manner consistent with research participant informed consent and with appropriate consideration of participant privacy risks. While the new data types covered by the GDS Policy are different from the relatively narrow specifications for GWAS data previously covered, the ethical and scientific issues raised are very similar, and the appropriate principles that should govern responsible use of various genomic data types are not substantially distinct. Consideration for the broad array of study designs, study populations, and potential consent provisions attached to genomic datasets remains integral to the GDS Policy, as does the agency's commitment to maintain the Policy's data-sharing expectations and participant-protection mechanisms.
The ethical use of genomic data and the public's trust in the systems used to govern that use continue to be of paramount concern for NHGRI and NIH. As such, the Institute, the agency, and I, personally, remain committed to the ongoing development and improvement of our data sharing policies and their oversight, so that biomedical (and genomic) research data are maximally utilized to promote public benefit. For further information on the GDS Policy, visit nih.gov/news/health/aug2014/od-27.htm.