Computational and Mathematical Biology Applications to Genomics and Genetics Research: A Review of Trends and Activities in Academia
I. Executive Summary
The Human Genome Program (HGP) will produce vast amounts of genomic DNA sequence information within the next five to ten years. This information will be of little value to biologists if the tools to manage and interpret it are unavailable or are not user friendly. To develop a plan for how the National Human Genome Research Institute (NHGRI) will ensure that these resources are in place, telephone discussions were held with approximately 15 scientists with backgrounds in mathematics, physics, informatics, statistics, computer science, and molecular biology who are also concerned about these issues. All were asked to describe barriers and opportunities that could be addressed by the NHGRI acting individually or in collaboration with other components of the National Institutes of Health (NIH) or the private sector. Five areas were identified: infrastructure; career development; career paths in academia; research training; and research. In addition, it was recognized that industry also has a very important role in these areas. Thus, a dialogue with leaders in academia, industry and government was considered appropriate and timely. The following recommendations were offered for consideration:
Infrastructure
Curriculum Development
Career Development and Research Training
Research
Outreach
II. Background
The Human Genome Program (HGP) will produce vast amounts of genomic DNA sequence information. The management and interpretation of this information will require 1) appropriate analytical methods, computer tools and information systems for the collection, storage, and distribution of the mapping and sequencing data and 2) a trained cadre of scientists with interdisciplinary skills--those who understand the biological problem at hand and can find solutions by applying skills from other disciplines.(3) Scientific disciplines that are key to the management and interpretation of genome data include computational and mathematical biology and statistics. In the 1995-96 annual progress report, the need to establish bioinformatics as a profession was emphasized. The document identified the problems in establishing a new profession, such as "winning acceptance for a new interdisciplinary specialty in academic institutions (particularly in an era in which resources are not growing) and winning academic acceptance for an application-oriented (as opposed to theory-oriented) discipline." It was noted that some progress is being made: a few institutions are beginning to establish graduate programs in bioinformatics, and the NHGRI's Special Emphasis Research Career Award has succeeded in supporting the training of a few mathematical and computational biologists. However, these efforts are inadequate given that the large-scale genome sequencing efforts in model organisms and humans are ramping up at a rate that will result in tens to hundreds of millions of base pairs of information in sequence databases. This information will be of little value if the tools to manage and interpret it are not in place and are not user friendly.
Thus, at least two types of experts are needed: 1) individuals with solid backgrounds in the mathematical, physical or computer sciences who also have sufficient knowledge about the biology to understand the challenges and can develop appropriate analytical methods and computer tools and 2) biologists who understand the questions that can be addressed with these data and have a thorough grounding in mathematics, statistics or computer science who can develop user-friendly tools for general use.
III. Methodology
I interviewed by telephone the individuals listed in Appendix A. Most are establishing or attempting to establish departments, programs or foci for computational or mathematical biology within their academic institutions. Each was asked to describe their current situation and to address whether there is a need to strengthen computational or mathematical biology in academia and, if so, what the barriers are, what model programs exist, and what NIH mechanisms besides the institutional training grants (T32) and the mentored research scientist development award (K01) should be developed to increase the potential for establishing visible and viable computational and/or mathematical biology programs or departments in academia. A draft of this report was shared with all the interviewees and many of them provided comments. Most of the suggestions were incorporated; however, the author of this report takes full responsibility for its contents. The author also acknowledges that this is a select and not a statistical sampling of views; therefore, some of the suggestions and opinions may represent the interviewees' biases. In addition, the opinions of university (with one exception) and industry leaders are not represented in this report. This report is an outgrowth of an internal informal discussion by staff in October 1996.
IV. What is Needed
The interviewees identified five areas that need to be developed or strengthened in order for mathematical and computational biology to thrive as interdisciplinary areas relevant to genomics/genetics research in academia. These are infrastructure, curriculum development, career development, research training and research. Below is a summary discussion of each of these areas.
Infrastructure
In order for a new discipline to thrive in academia, it must have an intellectual and fiscal infrastructure, ideally in the form of a department. At present, there are probably very few computational or mathematical biology departments in U.S. institutions. Several barriers to establishing departments or programs were identified: 1) Most academic institutions have not as yet recognized computational and mathematical biology as important emerging areas of science worthy of elevation to a department level. 2) The application of mathematical or computer science principles to biology is an expanding discipline that forges interactions between two disciplines (biology and the mathematical or computer sciences) that normally do not interact scientifically and tend to be separated physically and organizationally. 3) In making tenured appointments for individuals in interdisciplinary research, decisions must be made about which department's slot will be used; in times when growth is restricted, such decisions can be difficult. 4) The type of interdisciplinary research that is being pursued may not be valued in the primary department. For example, most computer science and mathematics departments focus on theoretical rather than applied research. In spite of these barriers, several universities have made some progress in developing a program or focus for mathematical and computational biology.
There are a few institutions where the leadership has recognized the importance of this interdiscipline and supports this effort formally (i.e., top-down approach). Some examples are the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) (4) at Rutgers University and the University of California, Santa Cruz, where the leadership has made interdisciplinary research and bioinformatics a part of the university's strategic plan. The Department of Biomathematics at the UCLA School of Medicine trains doctoral students in a variety of disciplines including mathematical genetics. The institute/center at Washington University in St. Louis and the University of Pennsylvania are examples of bioinformatics programs established in response to the computational biology needs of the current or formerly NHGRI-supported Genome Science and Technology Centers (GESTECS) located at these institutions (i.e., bottom-up approach). The need for information management systems for laboratory management and data interpretation was the nucleus around which these programs were established. Another arrangement that has proved productive is the one at Washington State University and the University of Southern California, where highly motivated individual faculty members in the departments of mathematics and biology work with graduate students interested in interdisciplinary projects (i.e., ad hoc approach). Whereas a scientific discipline is probably better housed in a department, it is clear from the examples above that universities are using other mechanisms to develop interconnections within disciplines outside of departments through the establishment of centers, bureaus, and institutes. One interviewee cautioned about the difficulty of establishing new departments that are amalgamations of two or more disciplines.
A counterargument is that if no efforts were made to establish new departments, no new interdisciplinary departments would ever be established in academia. An alternative and still useful model is for graduate students to meet the requirements of an established discipline/department and then use that foundation to pursue an interdisciplinary project in another department. Whereas all the approaches discussed above have worked to train students at the interface of biology and the mathematical and computer sciences, they are less than ideal and are tenuous, depending on the chairpersons of the collaborating departments and each university's vision of its future. In order for a new discipline to grow and be stable, there are other requirements that must complement the academic structure--a curriculum specific to that discipline, a recognized career path, quality graduate students and resources to support them, and a strong research program which generates new approaches and technology for the new discipline.
Curriculum Development
A curriculum is the intellectual base upon which a new discipline is established and new concepts from different disciplines are integrated. There is a tendency in multidisciplinary fields to require that candidates learn everything from all related fields rather than synthesize a new curriculum tailored to the needs of the new discipline. The lack of a discipline-specific curriculum usually means that an individual will take longer to complete the requirements for a degree. Accordingly, students will be less attracted to a degree program that doubles the course requirements. Curriculum development requires time that most faculty do not have given their teaching, research, administrative/committee and training responsibilities.
There are several examples where individuals have developed new interdisciplinary courses, but because of lack of time, the courses, in their opinion, are not as comprehensive as is needed to really convey new approaches and concepts. All interviewees were of the opinion that a mechanism to give faculty release time to develop appropriate curricula and interdisciplinary courses would be extremely useful for the field and for training.
Career Paths in Academia
Individuals trained in computational or mathematical biology have several options for employment. The primary two are industry and academia. Industry offers better opportunities both in terms of compensation and a career path. Since the goal in industry is producing a product, individuals are hired for their expertise to get a job done without the constraints of needing to conform to the requirements of a home department or a discipline. The career path in academia is more complicated, especially for new untenured faculty. Because they are tenured, senior faculty are able to engage in interdisciplinary research since they have demonstrated their capabilities in their primary scientific discipline. However, as more universities recognize the need to foster interdisciplinary research, this may become less of a problem for untenured faculty. One of the concerns of graduate students and postdoctoral fellows who are interested in interdisciplinary research is which academic department will hire them. One of the interviewees presented the following two examples to illustrate the problems facing young scientists. The first concerns an individual whose undergraduate degree is in biology. He became involved with the use of computers in molecular biology, gained considerable experience in this area, and now wants to get a Ph.D. The question for him is how and where. After much discussion and soul-searching, he opted to pursue a degree in computer science.
He has passed the department's course requirements and now must choose a thesis topic. He is struggling with whether it should be a traditional computer science project as understood by computer scientists or whether it should be relevant to biology. The dilemma is that research that can be very valuable for biologists, and in a sense innovative, may not involve any new theoretical concepts in computer science. According to the interviewee, the individual is still working through these issues and the departmental structure makes it very difficult for him to make a decision. The second case is an individual with two Ph.D.s, one in mathematics and one in electrical engineering/computer science, who is now working on a genome-related project and has been extremely productive. He would like to stay in academia, but not as a research associate. He is an excellent researcher and would be an asset to many programs. The problem is which department. Can he hope to get an appointment in a department of mathematics or computer science that will welcome him to work on algorithm development in computational biology? The experience of this interviewee is that it will not be easy, but he plans to do whatever is necessary to help this person secure an appropriate academic position in a first-rate university. These two cases would not be problematic if interdisciplinary research were recognized as a legitimate research area in either a biology or a computer science department.
Research Training and Career Development
Training programs provide the academic structure by which graduate students and postdoctoral fellows learn the fundamental concepts of science and have the opportunity to test hypotheses to increase the intellectual basis of the field. The interviewees unanimously agreed that more individuals needed to be trained through organized and well-supported research training programs. At least three barriers to securing interdisciplinary training grants were identified.
One was the requirement that the applicant have well-documented and established relationships between faculty in collaborating departments. Many of the interviewees spoke of the difficulty new training programs have in meeting this eligibility requirement, primarily because of the amount of time it takes to interest faculty in other departments in making a real commitment to interdisciplinary research. However, once faculty members do become involved, usually because of the value added to their own research, the interactions are very productive for faculty, graduate students and postdoctoral fellows. The second was that the stipends paid to non-biologists tended to be significantly higher than stipends paid to biologists. The stipend level of postdoctoral fellows with degrees in computer science or mathematics with less than two years of experience ranges between $35,000 and $42,000. The National Research Service Award stipends for postdoctoral fellows range from $20,292 to $32,300. The latter rate is for postdoctoral fellows who have seven or more years of training beyond the doctoral degree. Graduate student stipends are $11,496. These stipends are geared more to the support of biologists than of non-biologists. Thus, trying to attract non-biologists to training programs at these stipend levels is very difficult, if not impossible. The third was that the new NIH policy of limiting tuition costs on training grants (5) will make it difficult for institutions to start new or maintain existing training programs. Another area of discussion was what the undergraduate background of graduate students trained in computational or mathematical biology should be. Many interviewees were of the opinion that it would be more desirable to recruit into these areas graduate students with undergraduate degrees in mathematics, statistics or computer science, rather than biology.
The reason for this position was that a strong foundation in mathematical concepts is difficult to acquire late in the educational process. Such students would be given sufficient training (didactic and practical) in biology, but not to the same intensity as that required for graduate students/postdoctoral fellows in biology. Again, the emphasis would be on developing an appropriate curriculum. Not everyone interviewed was in agreement about the type of undergraduate background necessary for computational or mathematical biology. It was noted that excellence can be achieved in many ways and that the perspective of those trained in biology, but who have been cross-trained in the mathematical and computer sciences, is also important. In fact, many of the current leaders in the field of computational and mathematical biology are individuals whose doctoral degree is in one of the specialties in biology. One of the interviewees suggested that the role of mathematics in biology extends beyond the HGP and into other disciplines in biology and thus other components of NIH should also be considering the establishment of interdisciplinary training programs. Given the role that mathematics and computational biology will play in molecular medicine, i.e., the identification of all or most of the genes causing disease and one of the several factors in the common diseases, MD/Ph.D. training programs should also expand training opportunities in these areas. The NHGRI's Special Emphasis Research Career Award (K01) was established in 1991 to recruit individuals with formal backgrounds in mathematics, computer science, chemistry, physics and engineering to pursue genomics research. Approximately 3-4 awards are made annually. All of the awardees have genome researchers as mentors.
Most of the interviewees had not heard about this program, but were enthusiastic about this type of award as well as an institution-type award that would support a critical mass of individuals to work in the area of computational or mathematical biology in their institutions. NHGRI staff expressed concern that because of the demand and high pay, many individuals who have been trained on government funds would opt for employment in industry rather than remaining in academia. Most interviewees did not view that as a problem. In many instances they cited colleagues who are periodically offered more lucrative positions in industry, but instead opted for academic freedom, the opportunity to train students, and the ability to pursue their own research interests.
Research
In order for a new scientific field to establish intellectual independence and to be strong in graduate training, an intense, stable research program is essential. Several problems were identified as barriers to getting research projects in computational and mathematical biology established. A major concern was the scientific peer review of interdisciplinary projects. In the opinion of many interviewees, study sections as presently constituted were not always capable of reviewing interdisciplinary research projects. Short project periods were also considered disruptive to research activities. Developing new concepts or applying concepts to new problems usually requires more than two years to demonstrate feasibility or progress. A three-year grant in essence gives the principal investigator approximately two years to demonstrate success. A three-year grant also makes it difficult to recruit postdoctoral fellows to work on the project, because of the tenuousness of support in future years. In several cases, interviewees were told that an NIH institute/center/division was not interested in supporting their research at that particular time.
After briefly discussing their proposed research, NHGRI staff was of the opinion that the research appeared appropriate to one or several NIH components. One interviewee suggested that funds be used to support individuals through research grants (R01s) rather than research career development (K) awards. The rationale is that individuals who receive salary support for career development may not be successful in obtaining peer-reviewed funds at the end of their award period, whereas if you fund research projects, the principal investigator has demonstrated her/his potential to generate new research findings in the field and the research project could serve as a means of training graduate students and postdoctoral fellows.
V. A Role for Industry
Most of the individuals interviewed stressed the importance of industry supporting, in a substantial way, the development and maintenance of strong foci of computational and mathematical biology in academia, for several reasons. First, industry has been very successful in recruiting trained individuals at all levels. As the large-scale genomic DNA sequencing effort ramps up, there will be an ever-increasing need for individuals who can manage and interpret the data that will be the platform upon which research in industry is pursued for the purposes of prevention, treatment and cure of diseases. Second, academia is usually the place where innovative, risky technologies are developed which are then used by industry. To drain trained personnel from academia without efforts to replace and increase the number of individuals involved in intellectual pursuits will eventually result in a loss of adequate human resources to feed the genetics revolution. Thus, it is a must for industry to partner with academia to ensure that there are sufficient trained personnel to develop new knowledge. There are some commercial enterprises that do contribute to this effort, but the level and duration of their commitment are unknown. Also, it was stressed that industrial funds committed should be unrestricted to give the institutions the needed flexibility to use the funds to strengthen their research efforts where and when appropriate.
VI. What is Available
Before developing new programs, it is important to document what is available and to determine whether there are model programs in computational and mathematical biology that should be replicated. The following list of programs, while not representative of all that is available, probably represents the major efforts in this area. The contributions from industry are not presented because there was no easy way to document or ascertain this information. The identifiable programs could be divided into three categories: 1) infrastructure; 2) career development; and 3) research training. Support of these activities is primarily through foundations and the federal government.
Infrastructure
The Whitaker Foundation (6) [whitaker.org]
The Foundation's Biomedical Engineering Development Awards are designed to create centers of excellence in biomedical engineering education by establishing or enhancing academic programs. Typical grants have three elements: a start-up award of up to $1 million (capital needs, such as renovations and laboratory enhancements), annual awards up to $500,000 for four years with an optional two-year extension (faculty salaries and graduate student support), and a continuation award of up to $1 million (strengthens the academic program). This award requires an affiliation between engineering programs and graduate or medical schools.
Career Development
The Charles E. Culpeper Foundation's Scholarships in Medical Science [goldmanpartnerships.org]
National Human Genome Research Institute's Mentored Scientist Development Award [grants2.nih.gov]
Research Training
Burroughs Wellcome Fund's Interfaces between the Physical/Chemical/Computational Sciences and the Biological Sciences [bwfund.org]
Alexander Hollaender Distinguished Postdoctoral Fellowships [orau.gov]
Alfred P. Sloan Foundation and U.S. Department of Energy Postdoctoral Fellowships in Computational Molecular Biology [sloan.org]
The Whitaker Foundation Graduate Fellowship Program [whitaker.org]
National Science Foundation [nsf.gov]
Several Directorates at NSF, Mathematical and Physical Sciences and Computer and Information Sciences and Engineering, support interdisciplinary training in the biological sciences.
Howard Hughes Medical Institute Graduate Fellowship Program [hhmi.org]
National Library of Medicine's Fellowship in Applied Informatics [nlm.nih.gov]
National Human Genome Research Institute's Institutional Training Grant in Genomic Sciences [grants1.nih.gov]
VII. Recommendations
The following recommendations are distilled from the discussions with the interviewees. Staff suggests that these recommendations serve as the starting point of a discussion with leaders in academia, industry and non-profits. There are clearly some areas where new mechanisms can be established, but the success of computational and mathematical biology depends upon developing a strategy in which all parties that have a collective vested interest in the area are brought together. Once there is agreement that an opportunity exists to provide stable support to a new discipline, these parties can discuss what needs to be done, who will or can do what, and how resources can be leveraged.
Infrastructure
Curriculum Development
Career Development and Research Training
Research
Outreach
Immediate Action Items
Appendix
Interviewees
Russ B. Altman, MD, Ph.D. (Medical Information Sciences)
Michael Boehnke, Ph.D. (Biomathematics)
Dan Davison, Ph.D. (Biological Sciences-Genetics)
Keith A. Dunker, Ph.D. (Biophysics)
Philip Green, Ph.D. (Mathematics)
David Haussler, Ph.D. (Computer Science)
Edward Holmes, MD
Webb Miller, Ph.D. (Mathematics)
Chris Overton, Ph.D. (Biophysics), MSE (Computer Science)
Neil Risch, Ph.D. (Biomathematics)
Fred Roberts, Ph.D. (Mathematics)
Temple Smith, Ph.D. (Physics)
Terence P. Speed, Ph.D. (Mathematics)
David States, MD, Ph.D. (Biophysics)
Gary Stormo, Ph.D. (Molecular Biology)
Clark Tibbetts, Ph.D. (Biophysics/Chemistry)
Michael Waterman, Ph.D. (Statistics)
Footnotes
Last Reviewed: April 2006