From The NIH Catalyst, September-October 2007
NISC: Tracing the Sequence of Events Leading to 21st Century Genomics
Ten years ago, the NIH Intramural Sequencing Center (NISC) began humbly with six gel-based sequencing machines crowded into a few rooms of borrowed lab space.
Today, NISC is not only one of the jewels of the NIH intramural program but also a highly productive sequencing center at the forefront of contemporary worldwide genomics research. "NISC now plays a key role in the ongoing NIH effort to unravel the genetic complexities of human health and disease," said NHGRI Director Francis Collins. "It has served as a vital component of numerous important sequencing projects over the past 10 years, all aimed at achieving an understanding of the human genome."
NISC researchers have also played a key role in large-scale collaborative projects, such as The ENCyclopedia Of DNA Elements (ENCODE).
A focus of NISC's current research portfolio is comparative sequencing, that is, sequencing and studying the genomes of other vertebrate species. The ENCODE project has capitalized on this expertise; NISC has to date been credited with sequencing a targeted 1 percent of the genomes of 26 mammals, from the hedgehog to the elephant.
NISC's original mandate in 1997 was to provide NIH intramural investigators with access to large-scale DNA sequencing and sequence analysis.
Start-up funds from the intramural programs of 14 NIH institutes and centers enabled NISC to renovate space and procure equipment for performing large sequencing projects (then defined as a modest 500 DNA sequence reads) on a fee-for-service basis. All 14 have availed themselves of NISC services over the years.
For example, notes NIDCD Director James Battey, NISC staff was instrumental in helping intramural scientists in the NIDCD Laboratory of Molecular Genetics identify several genes whose mutations result in hereditary hearing impairment, as well as two of the first mammalian taste-receptor subunits.
But the center has also evolved beyond its original vision, conducting larger-scale projects that, in turn, enabled its expansion. NISC's current capacity is approximately 6.6 million sequence reads per year, and growing.
Starting in temporary space provided by NIDCD at 5 Research Court in Rockville, NISC has since moved twice as it underwent major growth spurts. It now resides on the top floor of the NIH-rented building at 5625 Fishers Lane in Rockville.
A spacious and bustling facility, with a laboratory and computational staff of between 30 and 40 people, NISC produces DNA sequence data 24 hours a day, seven days a week. The sequencing center also is frequently toured by government and classroom contingents in search of a connection between advancing technology and biological research.
"NISC provides a valuable focal point for genomic education and outreach at NIH," observes Dr. Eric Green, NHGRI scientific director and NISC director. "We regard such outreach as part of NISC's mission."
NISC, says Green, "packs a lot of power in a midsized punch." In addition to working with NIH investigators, NISC collaborates with other leading genomics programs in this country and abroad, including the sequencing centers at the Eli and Edythe L. Broad Institute of the Massachusetts Institute of Technology and Harvard University in Cambridge, Mass.; Baylor College of Medicine in Houston; Washington University in St. Louis, Mo.; and the Wellcome Trust Sanger Institute in Cambridge, UK.
Since Green arrived at NHGRI from Washington University in 1994, his research program has focused on mapping, sequencing, and interpreting vertebrate genomes.
He brought one of the first comparative sequencing projects to NISC: sequencing and studying the region encompassing the cystic fibrosis gene in the mouse genome. The center's participation in three subsequent projects (sequencing the mouse genome, the Cancer Genome Anatomy Project, and the Mammalian Gene Collection Program) catalyzed a major NISC expansion.
"The net effect," Green said, "was the establishment of an efficient genome-sequencing program that produced extremely high-quality sequence data. Much of NISC's success can be attributed to a staff who routinely implements appropriate approaches and methodologies, in all cases tailored to important scientific projects."
After arriving at NHGRI as a postdoctoral fellow in 1994, Gerard Bouffard, director of the NISC Bioinformatics Group and associate investigator in the NHGRI Genome Technology Branch, witnessed the materialization of Green's concept of a high-throughput DNA sequencing facility at NIH.
"There were DNA-sequencing instruments here and there at different institutes, some of which were heavily overused while others didn't see much use," Bouffard recalled. "Rather than create a traditional sequencing core where someone could drop off any small-scale project, the NISC concept proposed to focus on projects that required the analysis of large numbers of samples."
Under Bouffard's leadership, the NISC Bioinformatics Group grew from an initial staff of two in 1997 to its current 11 members. Bioinformatics relies on computing capabilities, and NISC started with two disc arrays and combined disc space of 200 gigabytes. "In 1997, we thought that was a fantastic amount of disc storage," Bouffard said. "Nowadays you could almost put that in your pocket."
Bouffard recalls that in the early days of NISC, it would take a technician almost as much time to type in the sample names by hand as it did to pour a gel used for sequencing. His group established a naming scheme and a two-letter code system that they thought would last a long time. "As careful as we were to establish this system, the volume of sequencing projects tackled by NISC far exceeded our initial imagination," Bouffard said, referring to the present four-letter code system.
All NISC data are generated by the Sequencing Group, headed by Robert Blakesley, who is also an associate investigator in the NHGRI Genome Technology Branch. Blakesley joined NISC in 2000 after a 20-year career in the biotechnology industry. Some years before, he performed manual DNA sequencing in which gels were put against X-ray film, exposed for a day, and then developed to reveal dark bands on the film.
"The great advance at NISC," Blakesley said, "was that fluorescence was detected by an instrument in an automated fashion using software that converted the primary data to sequence. This eliminated the need for radioactivity and manual interpretation of bands on X-ray film."
The most senior technician in the Sequencing Group, Jyoti Gupta, arrived at NISC in 1998, immediately after her senior year as an undergraduate student studying biology. On the last day of classes in 1998, she received an e-mail offering her a postbaccalaureate Intramural Research Training Award (IRTA) at NISC. Gupta was then one of just three members of the NISC Sequencing Group; now she manages a team of seven technicians.
Gupta reflected with some amusement on the early days at NISC. The facility then consisted of three rooms, including a converted storage closet that housed six Applied Biosystems model 377 DNA-sequencing machines. These instruments were gel-based, requiring significant time and care to prepare the gel and to manually track the samples on a computer following the analysis. It was easier to track the sequence data on the computer monitor in the dark, so she and her colleagues used a low-tech cardboard box placed over their heads to help track the samples.
In 2000, capillary-based DNA sequencing systems came on the scene, eliminating the need for gels and the burden of tracking samples on the computer. With these new instruments came a major growth in personnel.
Together, these changes led to a massive increase in the overall sequencing capacity at NISC, allowing the center to become part of the consortium of large-scale sequencing centers and to adopt a focus on comparative sequencing.
The Past is Prologue
"NISC's leadership could see the future growth of large-scale sequencing, especially for generating data that would help us interpret the human genome sequence," Blakesley said. "We staked out a claim that NISC was going to be a comparative sequencing shop."
In fact, according to Blakesley, the initial sequences from various species generated by NISC provided helpful preliminary data that facilitated decisions regarding which other species' genomes to sequence.
Through the experience gained in sequencing DNA from more than 60 vertebrate species, NISC has established particular expertise in sequence finishing, a specialized activity that requires refinement and perfection of the sequence data.
Sequence finishing can be expensive, but it is essential because it ensures accurate sequence analysis. Numerous problems are routinely encountered during sequence finishing, including establishing whether detected sequence changes are due to evolution or reflect errors in the generated data. This keeps Gupta's group quite busy.
New software tools being developed by NHGRI tenure-track investigator Elliott Margulies, among others, are being used to extract information from the comparative sequence data generated at NISC.
Such analytical methods are integral to projects like ENCODE, which generated trenchant insights from the intense scrutiny of only 1 percent of the human genome. In fact, it was NISC's early forays into comparative sequencing that demonstrated the value of zeroing in on a limited portion of the human genome en route to establishing how to interpret the human genome sequence more broadly.
Over the past 10 years, NISC has cultivated a network of more than 100 national and international scientific collaborators. These collaborations have engaged NISC scientists in a broad array of research projects, ranging from human disease studies to the examination of chromosome structure and evolution. These efforts have led to the publication of more than 60 papers that include NISC staff members as co-authors.
Currently, in collaboration with investigators around the world, NISC is expanding its capabilities in the area of medical sequencing, that is, sequencing the DNA of patients as part of clinical research projects.
At the Bethesda campus, NISC is partnering with investigators across NIH in a recently launched clinical research project called ClinSeq, which will use cutting-edge high-throughput sequencing to identify genetic variants associated with clinical phenotypes, with an initial focus on cardiovascular disease. ClinSeq aims to deploy large-scale sequencing in a clinical research setting and will allow NISC to become increasingly involved in clinical research efforts at NIH.
Last Reviewed: March 27, 2012