Division of Genome Sciences

Computational Genomics and Data Science Program


Program Overview

Background

Extracting knowledge from data is a defining challenge of science. Computational genomics has been an important area of focus for NHGRI since the beginning of the Human Genome Project. Today, advances in tools and techniques for data generation are rapidly increasing the amount of data available to researchers, particularly in genomics. This increase requires researchers to rely ever more heavily on computational and data science tools for the storage, management, analysis, and visualization of data. NHGRI's commitment to computational genomics and data science aligns with the general mission of the NIH Big Data to Knowledge (BD2K) initiative and similar programs. These efforts support research and development of transformative approaches and tools that maximize the integration of Big Data (such as genomic data) and data science into biomedical research.

NHGRI Support

The NHGRI 2011 strategic plan identifies bioinformatics and computational biology as a cross-cutting area "broadly relevant and fundamental across the entire spectrum of genomics and genomic medicine." Projects involving a substantial element of computational genomics or data science account for almost a quarter of NHGRI's FY2014 budget; these areas are key components of many NHGRI grants and programs.

NHGRI's support for computational genomics and data science follows the general principles and priorities identified in the NHGRI Funding Policy. Particular priority is placed on "approaches generalizable across diseases and biological systems of higher order organisms." Projects focusing on a single disease are less likely to be relevant to NHGRI than those generalizable across multiple diseases.

Breadth of the NHGRI Computational Genomics and Data Science Program

Grants supported under this program span many scientific topics, as can be seen through NIH RePORTER. These grants can be usefully categorized, though neither exhaustively nor perfectly, into "Genome Analysis Tools and Software Resources" and "Data Management Resources." This structure is further explained in the text below and illustrated in Figure 1. It should be considered a general, non-exclusive framework for organizing grants into broad scientific categories of interest to NHGRI.

The links to NIH RePORTER will return a list of grants in each of these categories.

Figure 1. Genome Analysis Tools, Software and Data Management Resources

Genome Analysis Tools and Software Resources
  • Genetic variation, clinical and phenotype analyses
     
    • Variation and association analyses: These projects seek to develop new and improved methods for interpreting genetic variation, associating variation with phenotypes, and analyzing population data. Key types of genetic variation include single nucleotide polymorphisms (SNPs), insertions and deletions (indels), short tandem repeats (STRs), copy-number variants (CNVs), and structural variants. Associating genetic variation with diseases and traits may require diverse analytical approaches, including analysis of population data, gene-by-gene (GxG) interactions, and gene-by-environment (GxE) interactions.
       
    • Clinical and phenotype analyses: Projects in this area develop new and improved methods for the management and analysis of clinical phenotype and electronic health record (EHR) data.
       
  • Genomic data processing and analysis tools
     
    • Sequencing informatics: These projects develop new and improved methods for processing, aligning, and formatting sequence reads, performing genome assembly, and extracting sequence features.
       
    • Functional analyses: Gene regulation, gene expression, epigenetic modifications, and methylation all shape the relationships between genes and phenotypes. Projects in this category seek to facilitate the use of these and similar data types in genomics. This could involve anything from developing new or improved methods for handling diverse data types to developing and refining mathematical models of networks and pathways that aid in predicting the functional effects of variants.
       
    • Comparative genomics and metagenomics sequence analysis: Projects in this area develop new and improved methods for comparative genomics and for metagenomic sequence analysis. Comparative genomics tools include, among others, those for aligning two or more whole genomes or assemblies, aligning gene and protein families, and clustering orthologous families.
       
    • General genome data analysis tools: This category includes grants performing genome data analysis not covered in the other categories. Topics in this area include, among others, statistical methods for pattern recognition, applications to make genomics analysis more secure and efficient, and other tools to improve the usability and impact of genomics data.
       
  • Informatics platforms for genome analyses: These projects develop informatics systems and integrated computational environments. These software suites and web-based platforms enable the management, analysis, and visualization of genomic data using advanced statistical and informatics approaches.
Data Management Resources

Funding Opportunities

Investigators interested in submitting applications to NHGRI are encouraged to contact NHGRI program staff before submission to discuss their specific aims and their choice of Funding Opportunity Announcement (FOA). Contact information for NHGRI program staff is at the bottom of this page.

Research Project Grant (Parent R01 and Parent R21): Many applications are received through the Parent R01 (PA-13-302) and Parent R21 (PA-13-303) solicitations. These investigator-initiated grants allow researchers to target their specific area of science relevant to NHGRI's mission (per the NHGRI Funding Policy). Though R01s are accepted by many NIH Institutes and Centers (ICs), NHGRI is one of a limited number of ICs accepting R21s.

BISTI/BD2K Early Stage Development of Technologies in Biomedical Computing, Informatics, and Big Data Science (R01): PA-14-155 is targeted towards initial development of novel software.

BISTI/BD2K Extended Development, Hardening and Dissemination of Technologies in Biomedical Computing, Informatics and Big Data Science (R01): PA-14-156 is tightly focused on improving the core functionality of widely used software.

Genomic Resource Grants for Community Resource Projects (U41): PAR-14-191 is tightly focused on supporting major genomic resources, including those in informatics. Potential applicants are strongly encouraged to contact NHGRI program staff before developing an application.

Predictive Multiscale Models for Biomedical, Biological, Behavioral, Environmental and Clinical Research (U01): PA-15-085 is focused on supporting the use of multiscale models of genomic information to enhance the understanding of gene regulation.

Initiative to Maximize Research Education in Genomics: Courses (R25): PAR-13-012 supports short courses and conferences, including those covering topics in informatics.

Training and Career Development: NHGRI supports and participates in various funding opportunities for training and career development programs. Detailed information can be found on the NHGRI Training and Career Development page.

Small Business Funding Opportunities

Background information on NIH's small business funding opportunities can be found at https://sbir.nih.gov/.

BISTI/BD2K Early Stage Development of Technologies in Biomedical Computing, Informatics, and Big Data Science (R43/R44): PA-14-154 supports software development under the Small Business Innovation Research (SBIR) mechanism.

BISTI/BD2K Early Stage Development of Technologies in Biomedical Computing, Informatics, and Big Data Science (R41/R42): PA-14-157 supports software development under the Small Business Technology Transfer (STTR) mechanism.

PHS 2014-02 Omnibus Solicitation of the NIH, CDC, FDA and ACF for Small Business Innovation Research Grant Applications (Parent SBIR [R43/R44]): PA-14-071 supports research, including research related to informatics, under the Small Business Innovation Research (SBIR) mechanism.

PHS 2014-02 Omnibus Solicitation of the NIH for Small Business Technology Transfer Grant Applications (Parent STTR [R41/R42]): PA-14-072 supports research, including research related to informatics, under the Small Business Technology Transfer (STTR) mechanism.

Other Relevant NIH Funding Opportunities

NHGRI's Funding Opportunities page links to various NHGRI funding opportunities and provides instructions for signing up for NHGRI's funding opportunities email list.

The trans-NIH Big Data to Knowledge (BD2K) program was launched in 2012 "to enable biomedical research as a digital research enterprise, to facilitate discovery and support new knowledge, and to maximize community engagement." This program has sponsored numerous funding opportunities relevant to informatics.

The webpage of the Biomedical Information Science and Technology Initiative (BISTI) provides links to various informatics-related funding opportunities across NIH and other Federal agencies.

Program Staff

Program Directors

Valentina Di Francesco, M.S.
E-mail: Valentina.Difrancesco@nih.gov

Daniel Gilchrist, Ph.D.
E-mail: daniel.gilchrist@nih.gov

Peter Good, Ph.D.
E-mail: goodp@mail.nih.gov

Chris Wellington, B.S.
E-mail: wellingtonc@mail.nih.gov

Program Analyst

Kevin Lee
E-mail: kevin.lee5@nih.gov

Address
National Human Genome Research Institute
National Institutes of Health
5635 Fishers Lane
Suite 4076, MSC 9305
Bethesda, MD 20892-9305

Phone: (301) 496-7531
Fax: (301) 480-2770

Last Updated: July 31, 2015