The National Human Genome Research Institute (NHGRI) has begun a year-long planning effort to help guide the institute into the new era of genome research that will follow the completion of the Human Genome Project (HGP) in the spring of 2003. To launch this process, NHGRI invited experts in genomics, together with experts in the applications of genomics to biology and medicine and in its ethical, legal and policy implications, to a retreat at the Airlie Conference Center in Warrenton, Virginia, on December 12-14, 2001. The purpose of the meeting was to identify the key areas in genomics research that will become most important, relevant, and compelling over the next 10 to 20 years. With this information as a foundation, the priorities for NHGRI will be formulated during the course of the subsequent planning process. This report summarizes the major themes that emerged from the meeting.
The National Institutes of Health (NIH) initially created the National Center for Human Genome Research to oversee and manage NIH's contribution to the HGP. The center was later renamed the National Human Genome Research Institute (NHGRI) in recognition of its broader mission resulting from the addition of an intramural program and the realization that genomics would be an ongoing priority in biomedical research beyond the completion of the human genome sequence. The primary goal of the HGP is to determine the DNA sequence of the human genome and make it freely available as a resource to scientists around the world. The HGP produced a working draft of the human genome sequence in 2001 and will complete the human genome sequence in the spring of 2003.
From its outset, the HGP aimed to develop new information, tools and technologies that would enable scientists to gain a deeper understanding of the genetic contributions to disease and to use this knowledge to improve human health for all groups. The imminent completion of the project's initial goals presents a compelling opportunity to focus aggressively on translating the spectacular genomic advances into medical advances. Even with the determination of the human genome sequence and the other achievements of the HGP, there is still a major deficit in the understanding that is needed to connect basic research discoveries with new approaches to disease prevention and therapies. Much additional basic research, guided by a genomic approach, remains to be done to shed light on the many mysteries of biology. At the same time, genome research offers myriad other opportunities for connecting detailed knowledge of the human genetic instruction book with important problems in clinical research, including the causes of health disparities. These paths are not mutually exclusive, and finding the right balance between them, although challenging, will be the most effective approach in the end.
One important link between basic research and health benefits will be provided by the study of human genetic variation and its links to disease. The decision has already been made to pursue development of a haplotype map of the human genome. This map will provide an invaluable resource for finding genes associated with common, complex diseases and identifying the genetic factors in other conditions. It will be essential to deal sensitively with the ethical, legal and social issues that arise when studying the genetic variation among groups of individuals. Research on how to do this and how to explain the results of genetic variation research to the public will also be needed.
Although there are likely to be some early successes, translating basic research into well-designed diagnostics and therapeutics will take years, even decades. Thus, it is important to provide the public and the biomedical community with appropriate expectations regarding this process. In communicating the future of genomics, it is important to stimulate excitement and support, but not to over-promise results. The genomics research community needs to help the public understand that the HGP's phenomenal achievement in sequencing the human genome is important primarily as the beginning of a long journey toward advances in human health.
Now that the sequencing of the human genome is approaching completion, an important next step is to extract as much of the information contained in the genome as possible. All of the genes encoded in the genome should be enumerated, a task that is not as simple as once thought. Other data sets of high interest include all of the regulatory elements, all forms of structural RNA, recombination and replication signals, and all encoded proteins and their modified forms. Genomic sequence can in principle be determined completely, but it is unclear whether any of these other data sets can be compiled with the same degree of completeness. It is not obvious that completeness is even conceivable in the case of some data sets, such as transcripts (given alternative splicing), regulatory elements, proteins (given isoforms), protein-protein and protein-nucleic acid interactions, and protein modifications. The desire for completeness will need to be balanced by cost-benefit considerations.
Defining and creating the "parts list" will be of enormous utility, but it will not be nearly enough. Beyond knowing what the components are, it will be essential to know what they do. It is increasingly clear that most gene products do not function in an isolated manner, but in molecular complexes made up of two or more components and in co-regulated pathways and larger networks. New conceptual and technological approaches are necessary to elucidate these complex interactions. This will be a daunting task. In contrast to the goal of sequencing the human genome, the question of how biological networks should be conceptualized, studied and represented is still poorly understood.
New wet-bench and computational methods, and new modes of interaction between these methods, will be required to help solve these problems. Novel databases and mathematical methods will be essential in organizing and relating large quantities of different types of data, while computational models will be needed to describe and understand the corresponding networks. These predictive models will be the basis for designing new approaches to both basic and clinical research. Building and running such models may well require massive computing capacity. It should be possible to continue to learn useful approaches from colleagues in fields such as engineering and physics, where solutions to similar complex problems have already been developed.
The comparative study of genome structure, function and evolution will represent a central component of genomics research for many years to come. Indeed, it was understood by workshop participants that NHGRI and other funding agencies would continue to support the sequencing of many additional organisms' genomes and that these sequences would provide a wealth of comparative biological and functional information. Since numerous topics relevant to comparative genomics had already been addressed at a prior NHGRI workshop and since a process is now in place for identifying additional species whose genomes will be sequenced, this area of genome research was not extensively discussed at the Airlie meeting.
Translating the output of genome research into insights that can be used for biological and clinical research will demand new developments in computational biology, new types of databases for housing the many new data types that will be generated, and new tools for gaining access to and assimilating the stored data in an efficient fashion. Scientists have made rapid progress in recent years in the early development of computational tools and databases that improve our ability to survey the genomic landscape. At the same time, these tools are admittedly inadequate for the demands that will be placed upon them in the foreseeable future, and further development must keep pace with the accumulation of data. Bottlenecks threaten to arise as, for example, the rate of annotation of the sequence data falls behind the rate of data generation, especially with the lack of adequate vocabularies to describe disease phenotypes. One interesting approach that warrants further exploration is the development of distributed, community-based annotation systems. The establishment of standardized methods that allow scientists to describe phenotypes and to enter data into the databases in a compatible format might also help.
A key factor in achieving the successes of the HGP has been the driving force provided by technology development, and the willingness on the part of funding agencies to recognize it as a critical component of the effort. Continued technology development will be essential for further progress. Technology development is expensive and is likely to become more so as our understanding of biology becomes ever more sophisticated. However, the investment in technology development has produced significant pay-offs for genome science in terms of reducing unit costs, making large-scale data production economically feasible, and creating new approaches to previously intractable scientific problems.
Funding technology development raises challenging issues for peer review, for disseminating the new technologies and for interactions between the public and private sectors. These problems must be addressed if genome science is to continue to be a source of new insights into the key biological and biomedical problems we face.
As genome research begins to turn its attention from determining the human sequence to figuring out how to apply this new knowledge to improving human health and understanding biological systems, it will be increasingly important to attempt to define the interaction of genetic and environmental factors in disease. Linkages between the physical and social environments and the genome are complex and difficult to analyze. Genomics will need to integrate with other disciplines, such as epidemiology, social science and health services research, to do this successfully. This will be particularly true in studies of the interplay between genetic variation and environmental factors. Such studies will likely include large cohorts of individuals from geographically unique regions, as well as broader longitudinal studies that pursue many phenotypes simultaneously.
Since its inception, NHGRI has recognized the importance of addressing the difficult ethical, legal and social questions that are raised by genome research and the acquisition of genetic information. Such concerns are likely to increase as practical applications of genomics multiply. It will be increasingly necessary to establish close links between ethical, legal and social implications (ELSI) research, biomedical research and relevant policy issues. The genome research enterprise has an opportunity to make important contributions to the development of new policies.
Genomics is a rapidly expanding and evolving field. The skill sets needed for success are also evolving quickly, making it difficult to articulate, at any given time, a training curriculum that will ensure career success. It is increasingly apparent that the curriculum for biology students needs to be broadened so that they have more skills in the quantitative sciences and other non-biological disciplines. At the same time, there is a need to train new researchers in ELSI areas. There is also a dramatic need to broaden the types of people working in the field to include those trained in disciplines such as engineering, physics, epidemiology and health services research. The notable paucity of genome scientists from minority backgrounds is in need of major attention. Inadequacies in training seem to cut across virtually all genomic disciplines. While "multi-disciplinary" is an overused phrase, one cannot help but use it when contemplating what is needed for the genome scientist, clinician or ELSI researcher of the future. How to train such individuals within the confines of the traditional academic structure will be extremely challenging.
The public, although dependent on the fruits of scientific research, is on the whole scientifically uninformed. To make informed decisions about their own health care and to participate effectively in policy debates and decision-making, members of the public need a working understanding of genomics and genetics. Genome science needs to reach out to the general public in every way possible. It is critical that the next generation be genetically literate if the promise of genomics for improving human health is to be realized. There is a need for a public dialogue on the medical, social and ethical dimensions of genomics, but it is not clear how this dialogue should occur.
The challenge before NHGRI over the course of the next year is to define a strategic plan in the context of the very broad and rapid advances in genome science. NHGRI has an opportunity to maximize its impact on the advancing field of genome science and its vast potential for contributing to the improvement of human health. NHGRI needs to define its interests within that landscape and decide which opportunities should receive its focus. NHGRI should continue to take chances on risky and innovative research. As NHGRI moves forward in its planning, it should develop a series of cross-cutting workshops that will explore areas of potential interest in further depth and suggest approaches to defining the next set of priorities.
Last Updated: October 1, 2012