
Researchers Analyze First Complete DNA Sequence Generated
at the National Institutes of Health

November 2009
NHGRI researchers gathered for a data annotation jamboree for the first ClinSeq™ whole genome sequence are, from left, David Ng, M.D., staff clinician; Jim Mullikin, Ph.D., associate investigator; Les Biesecker, M.D., senior investigator and chief of the Genetic Disease Research Branch; Tyra Wolfsberg, Ph.D., associate investigator; and Mr. Pedro Cruz, senior bioinformatics specialist.
A group of more than a dozen gene hunters from the National Human Genome Research Institute (NHGRI) recently gathered at the National Institutes of Health (NIH) Intramural Sequencing Center (NISC) in Rockville, Md., to analyze data from the first complete DNA sequence of an NIH Clinical Center patient. The volunteer patient is enrolled in the ClinSeq™ study, a trans-NIH effort to understand the genetic roots of heart disease and the challenges of using genome sequencing tools for personalized health benefit in a clinical research setting.

The ClinSeq™ study, begun in 2007, enrolls patients, including a group who show the warning signs and symptoms of heart disease. Researchers use large-scale medical sequencing to gather detailed data about each participant's genetic makeup, and then analyze those data to see how individual genetic variations relate to the risk of heart disease. The study aims to develop an infrastructure for the generation and use of large-scale medical sequencing in a clinical research setting.

"This accomplishment represents a major milestone for ClinSeq™, as well as for the entire NIH Intramural Program," said NHGRI Director Eric Green, M.D., Ph.D. "In order for genome sequencing tools to move from the lab to the clinic, we need to develop new and better strategies to analyze sequence data to effectively guide patient care."

The ClinSeq™ volunteer provided a blood sample that was sequenced at NISC using the Illumina next-generation sequencing system. The process began in June and the finished sequence was delivered early in November. The Illumina system sequences both strands of diploid human DNA and achieves 50-fold coverage; by comparison, the human genome reference sequence represents a haploid genome and a modest 8- to 10-fold coverage. The repeated coverage of the sequence improves confidence in the data.

Researchers initially selected about 300 disease candidate genes to be analyzed in the patient's genome for association with heart disease. The clinical component of the ClinSeq™ team also evaluated the patient's heart health with standard tests and scans.

ClinSeq™ principal investigator Les Biesecker enters a database annotation to describe a genomic variation in the first whole genome sequenced as part of a clinical research protocol at NIH.
"We chose this patient for this pioneering effort because of his remarkable family history of heart disease," said ClinSeq™ Principal Investigator Les Biesecker, M.D., chief of NHGRI's Genetic Disease Research Branch. "Many of his family members have suffered early onset heart disease. Yet, they do not have any known disorder that can cause this and so the genetic cause of this problem remains a mystery."

The patient's genome sequence contains 3.5 million variants, or mutations. However, only a small number of those may have a harmful effect on the patient's health. According to Dr. Biesecker, during this one-day jamboree, the group was likely to identify 15 to 30 sequence variations from the candidate list of genes.

Following a two-hour orientation to the data and the database tools used in its analysis, Dr. Biesecker began divvying up assignments for the gene hunt among the NHGRI bioinformaticians and human molecular geneticists in attendance. "Who wants chromosome 5?" Hands shot up around the room and the gene hunters took to their task in earnest. "Can I do all the copy number variants?" asked Nancy Hansen, a member of the NISC bioinformatics team, referring to a specific type of variant, or mutation, involving duplicate copies of bases. Other forms of human genome variation include deletions and insertions of DNA sequences, as well as DNA sequences appearing in a different order than expected.

Preliminary analysis of the patient's symptoms and those of family members pointed to four regions in the genome sequence of the patient that could be explored to specifically address susceptibility to heart disease.

Jim Mullikin, Ph.D., NHGRI associate investigator, is head of the Comparative Genomics Unit at NISC, which for the past 12 years has provided NIH investigators with access to large-scale DNA sequencing. "We are usually looking at a few mutations across the DNA sequence of a lot of individuals," Mullikin explained. "This time, we're looking at lots of mutations across one individual. There's a whole different element to this problem."

Clesson Turner, M.D., a collaborating geneticist from Walter Reed Medical Center who recently completed a clinical and molecular genetics fellowship in NHGRI's Medical Genetics Branch and is a member of the ClinSeq™ study team, said, "We know a lot about this pedigree," referring to the family's health history. "Now that we have the patient's genome, we can learn a lot more."

About a half hour into the hunt, Dr. Biesecker drew the group's attention to the large projection screen at the head of the room. "We may have a finding," he declared. "I think we've found our first clinical mutation." The group followed along as Dr. Biesecker located an annotation in a database for the mutation conveying a carrier risk for a rare recessive disorder.

The finding relates to a clinical symptom that is not directly related to heart disease, but it is nonetheless a trait that either the patient or family members may want to know about.

By the end of the day, Biesecker was excited about the initial findings. "Knowing what we learned today from one patient's genome, I am optimistic about what we can ultimately learn about heart disease as we analyze additional patient genomes," he said.

Last updated: November 15, 2012