25th anniversary of the NIH Intramural Sequencing Center (NISC)
Last week, we started a new fiscal year for the federal government. As is often the case, we began Fiscal Year 2023 under a “Continuing Resolution” (CR), which means the government is currently operating based on last year’s budget — awaiting passage of a new budget for the current fiscal year. The CR allows NHGRI to continue functioning and performing most of its normal work, although we eagerly await a new budget for Fiscal Year 2023.
In September, President Biden announced his intention to appoint Renee Wegrzyn, Ph.D., as the inaugural director of the Advanced Research Projects Agency for Health (called ARPA-H). ARPA-H aims to conduct high-impact biomedical and health research with a mission “to benefit the health of all Americans by catalyzing health breakthroughs that cannot readily be accomplished through traditional research or commercial activity.” NHGRI looks forward to working with Renee and the growing ARPA-H staff in the coming months.
For those readers going to the American Society of Human Genetics annual meeting later this month, I would like to draw your attention to two ancillary workshops at this year’s meeting being hosted by NHGRI staff. The first, entitled “Applying for NIH Grants: Strategies for Success,” will provide an inside look at the NIH funding process as well as tips and tools for successful grant writing; this session will take place on October 26 and will have an option for remote participation. The second, entitled “NHGRI Building a Diverse Workforce – Listening to the Voices of Trainees and Early-Stage Scientists,” is designed for trainees and early-stage researchers and will feature discussions about professional opportunities in genomics; this session will take place on October 27.
All the best,
25th anniversary of the NIH Intramural Sequencing Center (NISC)
As the sun came up on 1997, the Human Genome Project (then in its seventh of 13 years) was adding megabases of genome sequence to public databases weekly, as large-scale DNA sequencing had advanced substantially. At that time, it became clear that having the local capacity to generate large amounts of DNA sequence data in support of genomics research projects would be highly beneficial to any major biomedical research enterprise, yet no such capability existed within the NIH Intramural Research Program. Recognizing this acute need, the recently appointed chief of NHGRI’s intramural Genome Technology Branch, Eric Green, proposed the idea of establishing a DNA sequencing center at NIH to Francis Collins and Jeff Trent (who at the time were NHGRI’s director and scientific director, respectively). Aided by the enthusiastic support and encouragement of the then-NIH director, Harold Varmus, the intramural research programs of 13 other NIH institutes and centers joined forces with NHGRI to establish the NIH Intramural Sequencing Center (NISC) in the fall of 1997. . . and the rest is history!
Under the leadership of Eric Green, M.D., Ph.D., NISC director from 1997 to 2009, and Jim Mullikin, Ph.D., NISC director from 2009 to the present, NISC became a well-respected and highly used large-scale DNA sequencing center that exemplifies the evolution of the field of genomics and the ever-increasing abilities to produce prodigious amounts of genomic data. NISC has experienced radical shifts in DNA sequencing technologies, research approaches, and data analysis methods, all while consistently emphasizing a strong collaborative spirit in their work. That spirit has brought them into dozens of major research projects with international consortia and NIH intramural researchers alike.
NISC aims to advance genome sequencing and its many applications, with a focus that goes beyond just generating DNA sequence data to promote the value of bringing genomics to biology and medicine. From the beginning, NISC emphasized the generation of the highest data quality possible and the central importance of working closely with their collaborators. As the years marched on and technological advances emerged, NISC continually expanded its experimental repertoire, with a partial list of methods including whole-exome sequencing, whole-genome sequencing, RNA sequencing, custom capture DNA sequencing, and ChIP-seq, among others.
NISC moved into their first research space in 1997 and began Sanger-based fluorescent DNA sequencing using six slab gel-based ABI-377 DNA sequencing machines. Except for two months in 2020 due to the COVID-19 pandemic, NISC has continuously produced high-quality DNA sequence since its inception, albeit with a regularly changing set of DNA sequencing machines. Their work also involved two subsequent moves to new locations, each needed to give NISC room to grow. While research projects and technology platforms have changed over the years, NISC’s DNA sequence-generating pipeline has adapted accordingly and is operated by a staff that has been remarkably stable in composition over a quarter of a century.
In 1999, NISC expanded its DNA-sequencing pipeline to contribute to the sequencing of the mouse genome. This included developing expertise in so-called sequence “finishing,” a highly manual and customized process that is required for producing high-quality and complete genome sequence. Around that time, NISC also participated in the Mammalian Gene Collection (MGC), a program that generated high-quality sequences of a reference set of cDNA clones representing human and mouse genes, which has proven to be an important resource for the research community.
NISC became deeply involved in comparative genomics through its Comparative Vertebrate Sequencing Project and NHGRI’s Encyclopedia of DNA Elements (ENCODE) Project. In the case of ENCODE, NISC’s genome-sequence data from nearly three dozen vertebrate species contributed to larger datasets that provided new insights about non-coding functional elements in the human genome.
When significant interest in human genome sequencing for the study of health and disease first emerged (once referred to as “medical sequencing”), NISC established a pipeline for generating human genome sequence data for identifying genomic variants of medical relevance. This included the importation of the powerful new “next-generation” DNA sequencing technologies as they matured in the decade following completion of the Human Genome Project: first Roche/454 pyrosequencing and later Illumina’s sequencing-by-synthesis. The unprecedented increase in the ability to generate DNA sequence data then allowed NISC to expand its experimental horizons, such as its involvement in major microbiome studies.
More recently, NISC stepped up during the COVID-19 pandemic to sequence SARS-CoV-2 genomes and immune-response transcriptomes collected as part of several NIH research projects. The group also produced 144 billion bases of ultra-long read DNA sequence data and conducted preliminary data analysis for the Telomere-to-Telomere (T2T) consortium, part of a broader effort that produced the first truly complete sequence of a human genome. Alongside all these developments, NISC remains dedicated to its fundamental goal of helping to support the research of NIH intramural investigators. Through these interactions, they continue to regularly help solve riddles that advance biomedicine.
Over its 25 years, NISC has been associated with some impressive statistics. The past and present roster of the group’s dedicated staff includes almost 150 people, with nearly all the current staff having been at NISC for 10 years or more. NISC once had as many as 50 staff members, but now operates with 25. The number of DNA-sequencing machines has ranged from 6 to 21, with 17 different types of instruments being used at one time or another. Years ago, NISC had one modest-sized compute server. Now, it has a high-performance cluster with over 2,500 nodes and 5 petabytes of local data storage. In total, NISC has worked with 20 NIH institutes and centers and 195 individual intramural investigators. Most impressively, NISC now produces 50 trillion bases of DNA sequence (on average) per month. This one-month total is greater than all the generated DNA sequence in its first 15 years of operation!
NISC’s future remains bright: new technologies continue to emerge, and the group regularly incorporates them into its pipelines. This includes the increasing adoption of so-called long-read DNA sequencing technologies, such as PacBio and Oxford Nanopore. The line of NIH intramural investigators waiting at NISC’s door seeking genomics collaborations is consistently impressive. NISC’s dedication, scientific achievements, and collaborative spirit have brought the group successes and accolades throughout its 25 years, and there is every expectation that this will continue in its next 25 years.
Historically Speaking series to feature African American genetics researchers
The Smithsonian’s National Museum of African American History and Culture will host the second part of a four-part Historically Speaking collaboration with NHGRI on October 20, 2022, at 7 p.m. ET. The event will be held in person in the Oprah Winfrey Theater and broadcast live for online viewers. Akilah Johnson of the Washington Post will moderate a panel with senior researchers affiliated with NIH. The researchers will discuss why they chose a career in medicine, recount their experiences with mentors, discuss the barriers they overcame in their career, and share how they promote more diversity in the field. NHGRI intramural investigator, Neil Hanchard, M.B.B.S., D.Phil., will be among the panelists. The event is made possible by contributions from the Foundation for the National Institutes of Health and support from NHGRI. Attendance and viewing of the event are free, but registration is required.
Bridge2AI program expands use of artificial intelligence in biomedical research
NIH has awarded $130 million in grants over four years through the NIH Common Fund’s Bridge to Artificial Intelligence (Bridge2AI) program to accelerate the widespread use of artificial intelligence (AI) by the biomedical and behavioral research communities. The goals of the program are to generate flagship AI-ready datasets centered around biomedical grand challenge questions, to emphasize ethical AI best practices for biomedicine, and to promote diverse teams who span the boundary separating AI and healthcare research. Bridge2AI is funding four data-generation projects that will focus on AI analyses of human voice recordings as a biomarker for health; intensive care unit data to build predictive models for adverse health events; spatiotemporal architecture of human cells and their use in interpretable genotype-phenotype learning; and Type 2 diabetes as a model to show how a person’s health is restored after disease. Bridge2AI will also fund a BRIDGE (or coordinating) Center for the program that will be comprised of six cores across three U54 awards.
New NHGRI program aims to systematically study human gene function
NHGRI has launched a new program called Molecular Phenotypes of Null Alleles in Cells (MorPhiC), which will systematically catalog human gene function. The program comprises five awards totaling $42.5 million over five years, with the goal to create null alleles for 1,000 protein-coding genes and characterize the phenotypic impact of loss of gene function at a molecular and cellular level. If successful, the program will inform future efforts to catalog the function of more human genes. Currently, over 6,000 out of the 19,000 protein-coding genes in the human genome have not been well-studied. The MorPhiC program seeks to explore methods and develop approaches for investigating gene function. Data from the program will be made available to the broader research community.
New educational materials aim to increase knowledge about sickle cell disease gene therapy
As part of National Sickle Cell Awareness Month, NHGRI recently released educational guides about gene therapies that are available for sickle cell disease. These were written for people living with sickle cell disease, their families, and other members of their support teams. The Health Disparities Unit in NHGRI’s intramural Social and Behavioral Research Branch, in partnership with the Democratizing Education for Sickle Cell Disease Gene Therapy Project, developed fact sheets that cover the clinical trial process and eligibility for sickle cell disease gene therapy; the types of sickle cell disease gene therapy; and gene therapy participation and mental health. Sickle cell disease is the most common inherited blood disorder in the United States. Approximately 100,000 Americans have the disease. In the United States, sickle cell disease is most prevalent among African Americans. About one in 12 African Americans carries the sickle cell trait. Hispanic Americans are also at higher risk.
NHGRI welcomes new NIH-ACMG and NHGRI-ASHG fellows
This fall, NHGRI welcomes four new fellows: two as part of the NIH-American College of Medical Genetics and Genomics (ACMG) Fellowship Program and two as part of the NHGRI-American Society of Human Genetics (ASHG) Fellowship Program. The NIH-ACMG fellowship seeks to increase the pool of health practitioners trained in managing research and implementation programs in genomic medicine. Julius Militante, Ph.D., R.N., and Veronica Abraham, M.D., M.P.H., M.Sc., are the fellows for 2022-2024. The NHGRI-ASHG Genetics & Public Policy Fellowship provides individuals an opportunity to work in NHGRI’s Policy and Program Analysis Branch, at ASHG, and in Congress; this year’s fellow is Albert Hinman, Ph.D. The NHGRI-ASHG Genetics & Education Fellowship program provides fellows an opportunity to work in NHGRI’s Education and Community Involvement Branch and at ASHG in developing educational programs for a wide range of audiences; this year’s fellow is Nancy Sey, Ph.D.
HL7 Fast Healthcare Interoperability Resources (FHIR) is a clinical data standard and an application programming interface to exchange Electronic Health Records (EHRs). FHIR provides an opportunity for the genomic medicine research community to develop ways to consistently report genomics and genomics-related data among different healthcare systems. However, additional developments are needed to better understand how these data should be represented in FHIR in order to best serve both the clinical and basic research communities. In 2019, NIH issued a notice encouraging researchers to explore the use of FHIR to capture, integrate, and exchange clinical data for research purposes and to enhance capabilities to share research data. Did you know that NHGRI-supported programs are involved in improving FHIR’s ability to exchange genomics and genomics-related information among discrete healthcare systems? For example, the Electronic Medical Records and Genomics (eMERGE) Network has been involved in expanding the capabilities of FHIR to represent clinical genomics results, and, more recently, FHIR is being incorporated into NHGRI resource programs such as the Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL).
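For readers curious about what a structured, exchangeable genomics result might look like, the following is a minimal sketch in Python. It assumes a heavily simplified FHIR-style Observation resource expressed as JSON; the field names (resourceType, status, code, subject, component) follow the general shape of FHIR resources, but the code system under example.org, the codes "variant" and "gene-studied", the patient reference, and the helper functions are hypothetical placeholders chosen for illustration, not actual terminology or any program's implementation.

```python
import json

# Hypothetical, simplified FHIR-style Observation describing a genetic variant.
# The "http://example.org/codes" system and its codes are placeholders,
# not real clinical terminology.
variant_observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {
        "coding": [
            {
                "system": "http://example.org/codes",
                "code": "variant",
                "display": "Genetic variant",
            }
        ]
    },
    "subject": {"reference": "Patient/example"},
    "component": [
        {
            "code": {
                "coding": [
                    {"system": "http://example.org/codes", "code": "gene-studied"}
                ]
            },
            # HBB is used here only as an illustrative gene symbol.
            "valueCodeableConcept": {"text": "HBB"},
        }
    ],
}


def component_text(observation, component_code):
    """Return the text value of the first component whose code matches, else None."""
    for component in observation.get("component", []):
        codings = component.get("code", {}).get("coding", [])
        if any(coding.get("code") == component_code for coding in codings):
            return component.get("valueCodeableConcept", {}).get("text")
    return None


def round_trip(observation):
    """Serialize the resource to JSON text and parse it back, as a receiving system might."""
    return json.loads(json.dumps(observation))


received = round_trip(variant_observation)
print(component_text(received, "gene-studied"))
```

Because the sending and receiving systems agree on the same field layout, the receiver can locate the gene in a predictable place rather than parsing free text, which is the kind of consistency the paragraph above describes.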
Efforts to standardize genomic medicine research are not finished. In February 2021, the NHGRI Genomic Medicine Working Group hosted a virtual meeting titled “Genomic Medicine XIII: Developing a Clinical Genomic Informatics Research Agenda.” The meeting called out “identifying and addressing semantic and syntactic gaps related to the representation of genomic information in existing clinical data standards and models” as a short-term need for the clinical informatics field. Over the coming years, NHGRI will be focusing on this issue, so as to pave the way for genomics to be responsibly incorporated into the clinical setting.
NIH and NHGRI News
About The Genomics Landscape
A monthly update from the NHGRI Director on activities and accomplishments from the institute and the field of genomics.
Last updated: October 6, 2022