The NHGRI Genomic Data Science Analysis, Visualization and Informatics Lab-space (AnVIL)

Overview

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) is a scalable and interoperable resource for the genomic scientific community that leverages cloud-based infrastructure to democratize genomic data access, sharing, and computing across large genomic and genomic-related datasets.

The AnVIL facilitates integration and computing on and across large datasets generated by NHGRI programs, as well as by initiatives funded by the National Institutes of Health (NIH) or by other agencies that support human genomics research. In addition, the AnVIL is a component of the emerging federated data ecosystem and is expected to collaborate and integrate with other genomic data resources through the adoption of the FAIR (Findable, Accessible, Interoperable, Reusable) principles, as their specifications emerge from the scientific community. The AnVIL provides a collaborative environment where datasets and analysis workflows can be shared within a consortium and prepared for public release to the broad scientific community through AnVIL user interfaces. The AnVIL will be tailored both for users who have limited computational expertise and for sophisticated data scientists.

Specifically, the AnVIL resource will provide genomic researchers with the following key elements: 

  • Cloud-based infrastructure and software platform
  • Shared analysis and computing environment 
  • Interoperability and compliance with the emerging federated data ecosystem
  • Cloud services cost control 
  • Genomic datasets, phenotypes and metadata 
  • Data access and data security 
  • User training and outreach 
  • Incorporation of scientific and technological advances for storing, accessing, sharing, and computing on large genomic datasets
     

For AnVIL-related comments or questions, please contact NHGRI at: anvil@mail.nih.gov

AnVIL Awards

Applications were submitted in response to the NHGRI AnVIL Funding Opportunity Announcement (FOA) RFA-HG-17-011, and two awards were made.

The AnVIL Data Ecosystem - U24HG010262

  • Data Sciences Platform, Broad Institute: Anthony Philippakis (contact PI), Daniel MacArthur (co-I)
  • Genomics Institute, University of California Santa Cruz: Benedict Paten (PI), David Haussler (co-I)
  • Center for Data Intensive Science, University of Chicago: Robert Grossman (PI)
  • McDonnell Genome Institute, Washington University in St. Louis: Ira Hall (PI), David Larson (co-I)
  • Vanderbilt University Medical Center: Robert Carroll (PI), Joshua Denny (co-I)
  • Institute for Precision Cardiovascular Medicine, American Heart Association: Jennifer Hall (PI)
     

Implementing the Genomic Data Science Analysis, Visualization, and Informatics Lab-Space (ANVIL) - U24HG010263

  • Department of Biology, Johns Hopkins University: Michael Schatz (contact PI), Jeffrey Leek (PI), Enis Afgan (co-I), Kasper Hansen (co-I)
  • Department of Biomedical Engineering, Oregon Health & Science University: Jeremy Goecks (PI), Kyle Ellrott (co-I)
  • Huck Institute of the Life Sciences, Pennsylvania State University: Anton Nekrutenko (PI)
  • Department of Biostatistics and Bioinformatics, Roswell Park Cancer Institute: Martin Morgan (PI)
  • Department of Medicine, Brigham & Women’s Hospital: Vincent Carey (PI)
  • Institute for Implementation Science in Population Health, City University of New York: Levi Waldron (PI)

Project Sites

[Image: AnVIL project sites]

External Consultant Committee

The External Consultant Committee (ECC) is a non-governing entity comprising a multidisciplinary panel of experts who will assist the National Human Genome Research Institute (NHGRI) in assessing the AnVIL. 

Members of the ECC are:

•    George Hripcsak, M.D., M.S.: Columbia University
•    Cinnamon Bloss, Ph.D.: University of California, San Diego
•    Carol Bult, Ph.D.: Jackson Laboratory
•    Nadav Ahituv, Ph.D.: University of California, San Francisco 
•    Mark Gerstein, Ph.D.: Yale University
•    Siddharth Pratap, Ph.D., M.S.: Meharry Medical College
•    Marylyn Ritchie, Ph.D.: University of Pennsylvania 
•    Adam Resnick, Ph.D.: Children’s Hospital of Philadelphia 
•    Karen M. Davis, M.S.: RTI International

Job Opportunities

Interoperability of NIH Cloud-Based Platforms for Genomics Research
NHGRI seeks a data or computer scientist to provide expertise to address the technical interoperability challenges of siloed, cloud-based platforms that host and make broadly available controlled-access human genomic and phenotypic (e.g., disease status) data. The Scholar will coordinate with NHGRI staff to identify interoperability projects across several NIH data platforms, contribute to the projects’ specifications, collaborate with the developers of the AnVIL platform to implement the specifications, and test new functionalities across the platforms to provide immediate, technically informed feedback to NHGRI and other NIH Institutes involved in the interoperability projects.

Contact Information

For any AnVIL-related comments or questions, please contact NHGRI at anvil@mail.nih.gov.

Program Staff

Co-Leads

Valentina Di Francesco, M.S.
  • Lead Program Director, Computational Genomics and Data Science
  • Division of Genome Sciences
Ken Wiley, Jr., Ph.D.
  • Program Director
  • Division of Genomic Medicine

Program Directors

Shurjo K. Sen, Ph.D.
  • Program Director
  • Division of Genome Sciences
Chris Wellington, B.S.
  • Program Director, Computational Genomics and Data Science
  • Division of Genome Sciences

Program Analyst

Joanna C. Chau
  • Scientific Program Analyst
  • Division of Genomic Medicine

Policy Analyst

Elena M. Ghanaim, M.A.
  • Policy Analyst
  • Policy and Program Analysis Branch

Last updated: May 29, 2020