Computational Genomics and Data Science Program

Extracting knowledge from data is a defining challenge of science.

Overview

Computational genomics has been an important area of focus for NHGRI since the beginning of the Human Genome Project. Today, however, advances in tools and techniques for data generation are rapidly increasing the amount of data available to researchers, particularly in genomics. This increase requires researchers to rely ever more heavily on computational and data science tools for the storage, management, analysis, and visualization of data. NHGRI’s commitment to computational genomics and data science is a key component of the NHGRI 2020 Strategic Vision and is in alignment with the NIH Strategic Plan for Data Science, which provides a roadmap for modernizing the NIH-funded biomedical data science ecosystem.

Read the Genomic Data Science Fact Sheet.

See the Draft 2023-2028 NIH Strategic Plan for Data Science.

NHGRI Support

The NHGRI 2020 Strategic Vision highlights the importance of bioinformatics and computational biology by stating, “all major genomics breakthroughs to date have been accompanied by the development of groundbreaking statistical and computational methods.” See an extensive outline of the 2020 Strategic Vision.

Projects involving a substantial element of computational genomics or data science account for around 30% of NHGRI’s FY2023 budget; these areas are key components of many NHGRI grants and programs.

NHGRI’s support for computational genomics and data science follows the general principles and priorities identified in the FY2023 NHGRI Funding Policy. NHGRI prioritizes funding support on “the development of resources, approaches, and technologies that accelerate and support studies focused on the structure and biology of genomes; functional genomics; the genomics of disease; the implementation and effectiveness of genomic medicine, computational genomics and data science; training, developing, and expanding the diversity of the genomics workforce; and ethical, legal, and social issues related to genomic advances.”

Program Breadth

The Computational Genomics and Data Science Program (CGDS) supports the development of advanced computational approaches, innovative data analysis tools, and data resources that provide scientific utility across the extramural research programs and divisions. The CGDS program includes a number of managed grants and programs spanning many scientific topics. These grants can be categorized usefully, though neither exhaustively nor perfectly, into three categories: Computational Genomics and Data Science Methods Development, Genomic Data Resources and Informatics Platforms, and Computational Genomics Training and Workforce Development. 

The links below lead to NIH RePORTER, a database that provides information on NIH-funded grants and research activities. Each link associated with a category will display the portfolio of FY2023 grants that received funding from the NHGRI Computational Genomics and Data Science Program.

Computational Genomics and Data Science Methods Development
  • Computational Methods for Clinical Genomics: Development and implementation of genomic-based clinical informatics resources and tools that harmonize scalable, sharable and computable inferences of genomic knowledge with clinical practice guidelines. Includes frameworks and collaborative tools that allow researchers to share, analyze and secure genomic data and patient information.
  • Computational Methods for Functional Genomics: Development of novel methods, software and tools to analyze gene regulation, gene expression, epigenetic modifications and methylation data. Includes methods to integrate and interpret across multiple data types.
  • Computational Methods for Genomic Sequencing Data: Development of novel methods, software and tools to process, align, format and visualize genomic sequence reads; perform genome assembly; and extract sequence features. Includes graph-based and other novel approaches for pangenome analysis. Also includes general genomic analysis tools.
  • Computational Methods for Variation and Association Analysis: Development of novel methods, software and tools for identifying and interpreting genetic variation, elucidating the genetic architecture of human traits and disease, and analyzing population and evolutionary level genomic data.
  • Privacy and Security Technologies: Development of novel methods, software and tools to maximize security in genomic data sharing and storage.
  • General Computational Tools: Any development of novel methods, software, or tools not covered in the other categories. 
Genomic Data Resources and Informatics Platforms
  • Genomic and Phenotypic Measures and Standards: Development of tools and standards to facilitate sharing and analysis of large-scale genomics data, phenotype data and associated metadata. Includes approaches to harmonize phenotypic information for use in genomic analysis, such as incorporating family history information, electronic phenotyping and ontology development.
  • Genomic Community Resources: Development and maintenance of resources that collect, curate, integrate and distribute comprehensive sets of genomic information from humans or biomedically relevant species. Includes software environments to store, share, analyze and visualize genomics data.
Computational Genomics Training and Workforce Development
 
NIH Strategic Plan for Data Science

2023-2028: In December 2023, NIH released a Draft 2023-2028 Strategic Plan for Data Science to solicit public comments. This updated Strategic Plan for Data Science builds on accomplishments from the initial NIH Strategic Plan for Data Science and will prepare NIH to face the acceleration of sophisticated new technologies and address the rapid rise in the quantity and diversity of data. The updated Strategic Plan supports the NIH Policy for Data Management and Sharing and embraces data-driven discovery as a powerful tool to elucidate biological processes and better characterize the health and health consequences of all people. The plan also fosters ethical use of new methodologies arising from artificial intelligence (AI) and machine learning (ML).

More information to come when the final 2023-2028 Strategic Plan for Data Science is released. 

2018-2023: As a result of the rapid changes in biomedical research and information technology, several pressing issues related to the data-resource ecosystem confront NIH and other components of the biomedical research community. To address these challenges, NIH released its first Strategic Plan for Data Science on June 4, 2018, to provide a roadmap for modernizing the NIH-funded biomedical data science ecosystem. In establishing this plan, NIH addresses storing data efficiently and securely; making data usable to as many people as possible (including researchers, institutions, and the public); developing a research workforce poised to capitalize on advances in data science and information technology; and setting policies for productive, efficient, secure, and ethical data use.

Workshops and Meetings

Funding Opportunities

Investigators interested in submitting applications to NHGRI are encouraged to contact NHGRI program staff before submission to discuss their specific aims and their choice of Funding Opportunity Announcement (FOA). Contact information for NHGRI program staff is at the bottom of this page. 

Active Funding Opportunities

Investigator Initiated Research in Computational Genomics and Data Science (R01 and R21): PAR-25-228 and PAR-25-229 invite applications for a broad range of research efforts in computational genomics, data science, statistics, and bioinformatics that are relevant to basic or clinical genomic science, or both, and broadly applicable to human health and disease.

Genomic Resource Grants for Community Resource Projects (U24): PAR-23-124 is tightly focused on supporting major genomic resources, including those in informatics. Potential applicants are strongly encouraged to contact NHGRI Program Staff before developing an application.

Trans-NIH Enhancement and Management of Established Biomedical Data Repositories and Knowledgebases (U24): PAR-23-237 supports the enhancement and maintenance of established, widely used data resources.

Trans-NIH Early-stage Biomedical Data Repositories and Knowledgebases (U24): PAR-23-236 supports the initial development of a data resource or a pilot of a significant modification to an existing resource.

Development and Implementation of Clinical Informatics Tools to Enhance Patients’ Use of Genomic Information (NOSI): NOT-HG-22-011 encourages applications to develop and implement patient-facing genomic-based clinical informatics tools that facilitate or enhance patient-provider electronic communication, patient tracking and registry functions, patient self-management and support, provider electronic prescribing, test tracking, referral tracking, and health care decision-making.

Parent NIH Solicitations: Parent R01 (PA-20-185 and PA-20-183), Parent R21 (PA-20-195 and PA-20-194), and Parent K25 (PA-20-199). These investigator-initiated grants allow researchers to target their specific area of science relevant to NHGRI’s mission (per the NHGRI Funding Policy). Other funding opportunities include PAR-21-075, which focuses on research experiences for students seeking a master’s degree. Additionally, NIH funding opportunities for Small Business Innovation Research (SBIR) and Small Business Technology Transfer (STTR) grants can be found at https://sbir.nih.gov/funding.

Broadening Opportunities for Computational Genomics and Data Science Education (UE5): RFA-HG-23-002 supports educational activities that encourage individuals from diverse backgrounds, including those from groups underrepresented in the biomedical and behavioral sciences, to pursue further studies or careers in research. This is a parallel effort with the (expired) RFA-HG-22-002 Educational Hub for Enhancing Diversity in Computational Genomics and Data Science.

Notice of Special Interest (NOSI): Supporting the Exploration of Cloud in NIH-supported Research: NOT-OD-24-078 announces the availability of funds from the Office of Data Science Strategy (ODSS) to NIH-managed or NIH-majority-funded projects that may benefit from using the cloud. The purpose of this announcement is to explore and test potential opportunities for leveraging cloud solutions to enhance existing NIH activities.

Notice of Special Interest (NOSI): Promoting Data Reuse for Health Research: NOT-OD-24-096 solicits competitive revision applications that focus on data reuse and secondary data analysis in NIH-funded data repositories and knowledgebases to advance scientific inquiry and address health research questions.

Request for Applications (RFA): Building Sustainable Software Tools for Open Science (R03 Clinical Trial Not Allowed): RFA-OD-24-010 enhances the sustainability and impact of research software tools by enabling the use of best practices and design principles in software development and by leveraging continuing advances in computing. 

Request for Applications (RFA): NIH Research Software Engineer (RSE) Award (R50 Clinical Trials Not Allowed): RFA-OD-24-011 provides salary support for exceptional Research Software Engineers (RSEs) who contribute their skills to the development and dissemination of biomedical, behavioral or health-related software, tools, and algorithms, as well as to the training of prospective users of these tools.

Resources for Other NIH Funding Opportunities 

NHGRI's Funding Opportunities page links to various NHGRI funding opportunities and provides instructions for signing up for NHGRI's funding opportunities email list.

The webpage of the Office of Data Science Strategy (ODSS) provides resources and links to various informatics-related funding opportunities across the NIH and other Federal agencies.

Expired Funding Opportunities

PAR-21-254: Investigator Initiated Research in Computational Genomics and Data Science (R01 Clinical Trial Not Allowed)

PAR-21-255: Investigator Initiated Research in Computational Genomics and Data Science (R21 Clinical Trial Not Allowed)

RFA-HG-24-004: ML/AI Tools to Advance Genomic Translational Research (MAGen) - Development Sites (UG3/UH3 Clinical Trials Not Allowed)

RFA-HG-22-020: Limited Competition: The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) (U24 Clinical Trial Not Allowed)

RFA-HG-22-021: The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space Clinical Resource (ACR) (U24 Clinical Trial Not Allowed)

RFA-HG-17-011: The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) (U24)

RFA-HG-22-002: Educational Hub for Enhancing Diversity in Computational Genomics and Data Science (U24 Clinical Trials Not Allowed)

PAR-20-097: Trans-NIH Biomedical Knowledgebase (U24)

PAR-20-089: Trans-NIH Biomedical Data Repository (U24)

 

Program Staff

Program Directors

Daniel A. Gilchrist, Ph.D.
  • Program Director
  • Division of Genome Sciences
Ajay Pillai, Ph.D.
  • Program Director
  • Division of Genome Sciences
Shurjo K. Sen, Ph.D.
  • Program Director
  • Office of Genomic Data Science
Idan Gabdank, Ph.D.
  • Program Director
  • Division of Genome Sciences
Chris Wellington, B.S.
  • Program Director, Computational Genomics and Data Science
  • Office of Genomic Data Science
Jean Gao, Ph.D.
  • Program Director
  • Office of Genomic Data Science

Scientific Program Analysts

Helen Thompson, B.A.
  • Program Specialist
  • Office of Genomic Data Science
Nicolas Keller, B.S.
  • Scientific Program Analyst
  • Division of Genome Sciences
Alessandra L. Serrano Marroquin, B.A.
  • Scientific Program Analyst
  • Division of Genome Sciences
Mike Lopez, B.S.
  • Scientific Program Analyst
  • Office of Genomic Data Science

Last updated: November 26, 2024