National Institutes of Health U.S. Department of Health and Human Services
Researchers explore genomic data privacy and risk
By Ian L. Marpuri
NHGRI Scientific Program Analyst
Nothing is more unique to a person than his or her own DNA, the molecule that codes for how we look and who we are. Genomic researchers routinely analyze anonymous DNA samples to learn more about disease and health. But what if someone could identify you from your DNA? Would you still be willing to volunteer for genomic research?
Two prominent genomics researchers, Isaac Kohane, M.D., Ph.D., and George Church, Ph.D., probed issues related to genomic data privacy and risk at a March 21 lecture organized by the National Human Genome Research Institute (NHGRI) to celebrate the 10th anniversary of the completion of the Human Genome Project. Dr. Kohane is director of the Children's Hospital Informatics Program at Children's Hospital, Boston, and the Henderson Professor of Pediatrics and Health Sciences and Technology at Harvard Medical School, and Dr. Church is a professor of genetics at Harvard Medical School and the founder of the Personal Genome Project.
Before the 2008 passage of the Genetic Information Nondiscrimination Act (GINA), people worried that companies would deny them health insurance or employment based on their genetic predisposition to certain conditions. While GINA exists to prevent this type of discrimination, it hasn't stopped leakage of genomic data in a few cases, Dr. Church noted.
For example, researchers recently reidentified around 50 individuals in the 1000 Genomes Project using only publicly available genomic information in the 1000 Genomes Project database and other free Internet resources. The 1000 Genomes Project is an NHGRI initiative that seeks to record genetic variation among different ethnic groups. In response, the NIH removed the ages of 1000 Genomes' participants from all websites and published a policy piece on reidentification in Science.
Privacy breaches aren't a new development or exclusively related to genomic information, said Dr. Kohane. Studies in the 1990s showed that it is possible to identify an individual's health records using publicly available data like voter registration records, zip codes and physical descriptions. Dr. Church described other ways that data privacy has been breached, such as the hacking of 26 million veterans' medical records and social security numbers from a laptop in 2006, or the reidentification of an anonymous sperm donor by his child using the child's own genetic data.
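To make the idea of linking records concrete, here is a minimal sketch of how "anonymous" health records might be matched against a public list. It is not the method used in the studies mentioned above. Every record below is made up, and the choice of zip code, birth year and sex as the matching fields, along with the names `health_records`, `voter_roll` and `link_records`, are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical, made-up records for illustration only.
# The "anonymous" health records carry no names.
health_records = [
    {"record_id": "H1", "zip": "02138", "birth_year": 1950, "sex": "F"},
    {"record_id": "H2", "zip": "02139", "birth_year": 1972, "sex": "M"},
    {"record_id": "H3", "zip": "02139", "birth_year": 1972, "sex": "M"},
]

# A public list (for example, a voter roll) that does carry names.
voter_roll = [
    {"name": "Voter A", "zip": "02138", "birth_year": 1950, "sex": "F"},
    {"name": "Voter B", "zip": "02139", "birth_year": 1972, "sex": "M"},
    {"name": "Voter C", "zip": "02139", "birth_year": 1972, "sex": "M"},
]

# Assumed matching fields; real studies may use different ones.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(row):
    """Return the tuple of matching-field values for one record."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def link_records(anonymous_rows, public_rows):
    """Map each anonymous record to a name when exactly one public row matches."""
    by_key = defaultdict(list)
    for row in public_rows:
        by_key[key(row)].append(row)

    matches = {}
    for row in anonymous_rows:
        candidates = by_key.get(key(row), [])
        if len(candidates) == 1:  # a unique match reidentifies the record
            matches[row["record_id"]] = candidates[0]["name"]
    return matches


print(link_records(health_records, voter_roll))
```

In this toy data, only record H1 matches a single named person. Records H2 and H3 each match two people, so they stay ambiguous. Adding more fields, or more data sources, tends to shrink that ambiguity.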
"If there's enough data out there, someone can mash it up and use it to identify anybody," said Dr. Kohane. "We have to realize that there's no such thing as perfect anonymity. However, just because the data is out there, doesn't mean that it's an invitation to breach your privacy. That's bad form."
Both speakers called on researchers to discuss the risks of participating in genomic research with study participants. Researchers need to make realistic claims about data privacy to patients while protecting their work and abiding by some set of operating guidelines, Dr. Kohane said.
In addition, the consent forms that people sign when they enroll in studies often do not clearly convey the possibility that someone could be reidentified from their genetic data, Dr. Church said. The 1000 Genomes Project consent form, for example, lists "information on how it would be very easy [to reidentify your genetic information], but they also say it would be very hard. This is a mixed message," he said.
Dr. Church requires participants in his Personal Genome Project (PGP) to score 100 percent on an exam about the consent form in order to enroll in the study. PGP is an international initiative that is currently the world's only open-access genomic data set. In addition, participants consent to be reidentified, although patient names are not listed with samples. Researchers urge participants to "pretend their name is out there and even consider putting their name out there on their own," Dr. Church said. PGP researchers hope to enroll 100,000 volunteers from each of four different countries.
As genomic data is integrated with other data such as phenotypes and environmental exposures, databases will provide a clearer picture of the genomic basis of disease and become more valuable, Dr. Church said.
It's just like Google Maps adding restaurant information to basic neighborhood maps to enhance their usefulness, Dr. Kohane added.
In the not-too-distant future, a person's genomic information will play a bigger part in their healthcare decisions. Genomics is already used to diagnose unknown conditions, find specific genetic variants associated with disease and enable the creation of gene therapies. As this happens, the distinction between a research study and clinical care is beginning to blur, Dr. Kohane said.
"Why have patients and researchers enter into a compact of mutual ignorance where the doctor agrees not to find out the patient's identity and the patient agrees not to find out what the researcher learned about them from the study?" Dr. Kohane asked. Researchers should be able to accommodate patients' requests to learn about their genome, he said.
Both speakers agreed that maintaining a high level of privacy for people's genomes is crucial to integrating genomics into healthcare settings.