Large-Scale Genome Sequencing and Analysis Centers (LSAC)
Purpose and Scope
The scope and purpose of the LSAC component of the NHGRI Genome Sequencing Program is to provide large-scale genome sequence datasets in pursuit of multiple long-term goals of high significance to a broad range of the biomedical research community. These include identifying somatic mutations associated with cancer, characterizing variation underlying complex disease, pursuing questions about basic genomic variation and how it relates to biology and disease, exploring basic questions in comparative and evolutionary genomics through sequencing many organismal genomes, adding value to model organism research by providing reference genome sequences, and other areas. These scientific program aims are detailed in RFA-HG-10-015.
It is important to note that the LSAC program provides much more than large-scale production capacity. The program grantees have the flexibility to rapidly explore and adopt new methods and strategies that arise out of technology platform changes, or that are required to investigate a specific genomic sequencing project at scale; for example, the LSAC optimized targeted and exome sequencing at scale in order to pursue project types that have now become routine. The program also develops and propagates improvements in sequencing and analysis pipelines, and develops standards and best practices. This is facilitated by the program's organization as a research network with other GSP components, which provides a structure for coordination within and between programs.
The best-known example of the methodological benefits of scale within the program, and of their promulgation throughout the biomedical community, is the decrease in sequencing costs (see: DNA Sequencing Costs: Data from the LSACs). This work has several dimensions, including developing a common framework for assessing and communicating about costs, understanding the relationships between cost and quality, and, most substantially, driving the reduction in cost at scale.
Beyond this, the LSAC program grantees carry out the majority of their work in the context of large collaborations with different investigator communities (See: Selecting New LSAC Projects, and Active Sequencing Projects below). To achieve this, the LSAC program serves as a venue for these large, multi-party collaborations. In this role, the LSAC grantees provide expertise in project design (for example, number of samples needed to obtain adequate power to detect disease associations) and data processing and analysis capabilities (e.g., providing large-scale variant calling; interpreting associations).
Because the LSAC centers have experience with a broad range of projects and investigator communities, they are in a unique position to investigate basic overarching questions of fundamental significance, for example about the genetic architecture of common disease and cancer and the ability to interpret variants and mutations; and in comparative genomics, about the detectability of evolutionarily constrained sequences, with ramifications for understanding genome function. These insights in turn have implications for the design of efficient large-scale projects, the direction of technology development, development of informatics analysis tools, and ultimately the use of sequence information to understand human disease.
These considerations highlight the role of the LSAC program (a combination of NHGRI staff, grantees, and advisors) in identifying and designing new project types that address the most compelling questions that can be answered as the state of the art in high-throughput sequencing changes. NHGRI anticipates that the type and number of important large-scale sequencing projects will continue to expand, requiring new flexibility from this component of the program. These considerations also highlight the connections between the LSAC program and other NHGRI programs that depend on knowledge about disease variation.
A discussion of current priorities for the program is available at: Discovering the genetic basis of human disease as a foundation for genomic medicine. Although this document, derived from a program meeting held in 2013, is not a formal statement of NHGRI priorities, it accurately represents the current discussion within the GSP about the major scientific issues and priorities for large-scale sequencing circa 2014.
LSAC Projects
Information about ongoing projects is available at: Active sequencing projects
NHGRI requires that active projects be accessioned and described in an appropriate repository, for example the NCBI BioProjects pages, dbGaP, the 1000 Genomes website, or CGHub.
An archive describing project/target selection procedures, working groups, and project descriptions that have been pursued over the history of the program is available. It includes links to projects and project descriptions, and previous rationales for selecting organismal and medical sequencing targets. NHGRI will complete all projects that were committed to under previous iterations of the program, unless specifically ended due to scientific or programmatic considerations.
Selecting New LSAC Projects
The NHGRI LSAC program provides high-throughput sequencing capacity, project design expertise, and analysis capability. In addition, it provides the flexibility to rapidly explore and adopt new methods and strategies. It also provides a venue for the coordination within and between projects, both to facilitate large multi-party collaborations involving the sequencing centers and outside investigators, and to propagate improvements in sequencing and analysis pipelines.
The LSAC is structured to be able to undertake a wide range of project types in order to take advantage of changing scientific opportunities. These projects range from very large human studies (e.g., whole exome sequencing of tens of thousands of samples for case/control human disease studies), to mid-sized human studies (e.g., whole genome sequencing of tens to hundreds of human samples for family studies or cancer projects), to individual or multiple organismal genome projects. The LSAC may also undertake small pilot studies, for example to understand the feasibility of a larger project, or to implement and optimize new sequencing technologies that are required to explore a specific question or to pilot a large-scale implementation.
To account for the range of scale and type of project, there are several mechanisms used to select new projects:
- Institutional high-priority projects. These large projects are usually initiated through collaborations between NHGRI and other NIH institutes and are usually shaped in consultation with the research community. In many cases, the project has included a separate solicitation for non-production project components sponsored by the other institute (coordination, project design, and analysis). These substantial projects may be tightly organized (e.g., with their own coordinating centers, advisory bodies, and web portals), or more loosely organized.
In addition to the above, LSAC participation in these projects, and project design, are vetted separately in advance by the program scientific advisors (SAP) and the NACHGR. Examples of this type of project include The Cancer Genome Atlas (TCGA) and The Alzheimer's Disease Sequencing Project. The 1000 Genomes Project was selected in a similar way, that is, based on discussions between NHGRI and the international community of scientists and funding bodies, with capacity provided by NHGRI and international partners, and project coordination and analysis funded through separate grant mechanisms from NHGRI and others.
- Center-initiated projects. The original cooperative agreement proposals in response to the solicitation that funded the current round of LSACs included proposals for center-initiated projects (CIPs), which were considered in the original review of the proposals. As LSAC centers complete CIPs, they are encouraged to propose new ones consistent with NHGRI scientific and programmatic priorities. Smaller CIPs (usually pilot efforts) are reviewed by NHGRI staff. Larger projects are evaluated by the SAP; if a CIP is substantial and represents a new effort, it is discussed with the NACHGR prior to approval. Most CIPs are initiated by an LSAC center in collaboration with outside investigators.
- Community white papers. At the outset of the current funding period, NHGRI encouraged white papers from the community proposing specific sequencing targets; white papers were evaluated by a separate group of outside scientists and were subject to final approval by the NACHGR. This mechanism was widely used for proposing organismal genome sequence targets. However, as NHGRI program priorities have shifted increasingly towards very large human disease variation studies, the mechanism was largely discontinued as it was deemed ineffective in identifying large projects suitable for the LSACs as their capacity grew. Nonetheless, NHGRI does consider ideas from the community, and occasionally will consider white papers that meet at least the following criteria:
  - The project is within NHGRI's programmatic interest, i.e., serves an important strategic objective.
  - The scientific goals are significant.
  - The project could not be funded through an R01 or similar investigator-initiated grant mechanism.
  - The scale of the project is suitable for the LSACs (not too small); the LSACs will not consider individual Mendelian disease projects.
  - The project will produce data that will be widely used.
  - The research community that will use the data is strong.
  - There are no similar projects (i.e., same disease, same samples, same organism) that NHGRI or others are already pursuing; if the white paper proposes that NHGRI be part of a large coordinated effort, then the coordination has to be clear.
  - The individuals proposing the white paper must be prepared to work with NHGRI and the LSACs collaboratively. It is expected that final white papers or project plans will be crafted in collaboration with an LSAC.
Current Program Grantees
- The Broad Institute [broadinstitute.org]
- Baylor College of Medicine Human Genome Sequencing Center [hgsc.bcm.tmc.edu]
- Washington University Genome Institute Sequencing Center [genome.wustl.edu]
Last Updated: July 25, 2014