
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL)

Overview

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) is a cloud-based genomic data sharing and analysis platform. AnVIL facilitates integration and computing on and across large datasets generated by NHGRI programs, as well as by initiatives funded by the National Institutes of Health (NIH) or by other agencies that support human genomics research. AnVIL is a component of the emerging federated data ecosystem and actively collaborates and integrates with other genomic data resources through the adoption of the FAIR (Findable, Accessible, Interoperable, Reusable) principles. AnVIL provides a collaborative environment and interfaces for consortia and researchers. It offers training and functionality for users with limited computational expertise as well as for sophisticated data scientists.

Specifically, the AnVIL resource provides genomic researchers with the following key elements: 

  •  Cloud-based infrastructure and software platform
  •  Shared analysis and computing environment 
  •  Interoperability and compliance with the emerging federated data ecosystem
  •  Cloud services cost control 
  •  Genomic datasets, phenotypes and metadata 
  •  Data access and data security 
  •  User training and outreach 
  •  Incorporation of scientific and technological advances for storing, accessing, sharing and computing on large genomic datasets

AnVIL Awards

Applications were submitted in response to the NHGRI AnVIL Funding Opportunity Announcement (FOA) RFA-HG-17-011, and two awards were made.

The AnVIL Data Ecosystem - U24HG010262

  • Data Sciences Platform, Broad Institute: Anthony Philippakis (contact PI)
  • Genomics Institute, University of California Santa Cruz: Benedict Paten (PI), David Haussler (co-I)
  • Center for Data Intensive Science, University of Chicago: Robert Grossman (PI)
  • McDonnell Genome Institute, Washington University in St. Louis: Ira Hall (PI), David Larson (co-I)
  • Vanderbilt University Medical Center: Robert Carroll (PI), Joshua Denny (co-I)
  • Institute for Precision Cardiovascular Medicine, American Heart Association: Jennifer Hall (PI)
     

Implementing the Genomic Data Science Analysis, Visualization, and Informatics Lab-Space (AnVIL) - U24HG010263

  • Department of Biology, Johns Hopkins University: Michael Schatz (contact PI), Jeffrey Leek (PI), Enis Afgan (co-I), Kasper Hansen (co-I)
  • Department of Biomedical Engineering, Oregon Health & Science University: Jeremy Goecks (PI), Kyle Ellrott (co-I)
  • Huck Institute of the Life Sciences, Pennsylvania State University: Anton Nekrutenko (PI)
  • Department of Biostatistics and Bioinformatics, Roswell Park Cancer Institute: Martin Morgan (PI)
  • Department of Medicine, Brigham & Women’s Hospital: Vincent Carey (PI)
  • Institute for Implementation Science in Population Health, City University of New York: Levi Waldron (PI)

Project Sites

[Image: AnVIL project sites]

External Consultant Committee

The External Consultant Committee (ECC) is a non-governing entity comprising a multidisciplinary panel of experts who will assist the National Human Genome Research Institute (NHGRI) in assessing the AnVIL. 

Members of the ECC are:

  • Nadav Ahituv, Ph.D.: University of California, San Francisco 
  • Cinnamon Bloss, Ph.D.: University of California, San Diego
  • Carol Bult, Ph.D.: Jackson Laboratory
  • Karen M. Davis, M.S.: RTI International
  • Sean Davis, M.D., Ph.D.: University of Colorado Denver
  • George Hripcsak, M.D., M.S.: Columbia University
  • Aleksandar Milosavljevic, Ph.D.: Baylor College of Medicine
  • Siddharth Pratap, Ph.D., M.S.: Meharry Medical College
  • Adam Resnick, Ph.D.: Children’s Hospital of Philadelphia
  • Marylyn Ritchie, Ph.D.: University of Pennsylvania 

NIH Cloud Platforms Interoperability

The NIH Cloud Platforms Interoperability (NCPI) effort is working to enable cross-platform authentication and authorization, data discovery and the exchange of datasets, analysis workflows and results to support the creation of a federated genomic data ecosystem. NCPI is a collaboration between NIH representatives, platform team members and researchers running cross-platform research efforts to inform and validate the interoperability approaches. 

Learn more about the NIH Cloud Platforms Interoperability effort.

Genomic Data Science Community Network

The Genomic Data Science Community Network (GDSCN) is a partnership of educators and researchers at Historically Black Colleges and Universities (HBCUs), Minority Serving Institutions (MSIs), Tribal Colleges and Universities (TCUs), and Community Colleges (CCs) with members of the AnVIL team to broaden the spectrum of diverse institutions active in bioinformatics and genomic data science research.

Learn more about the Genomic Data Science Community Network.

AnVIL Cloud Credits Program

In April 2020, the AnVIL Cloud Credits (AC2) Program accepted pilot applications proposing research or training projects relevant to NHGRI’s mission that would use AnVIL, with cloud computing credits, for large-scale data analysis.

Learn more about the AnVIL Cloud Credits Program.

Contact Information

For any AnVIL-related comments or questions, please contact NHGRI at anvil@mail.nih.gov.

Program Staff

Co-Leads

Valentina Di Francesco, M.S.
  • Lead Program Director, Computational Genomics and Data Science
  • Division of Genome Sciences
Ken Wiley, Jr., Ph.D.
  • Program Director
  • Division of Genomic Medicine

Program Directors

Shurjo K. Sen, Ph.D.
  • Program Director
  • Division of Genome Sciences
Chris Wellington, B.S.
  • Program Director, Computational Genomics and Data Science
  • Division of Genome Sciences

Program Analysts

Natalie Kucher
  • Scientific Program Analyst
  • Division of Genome Sciences
Ana Stevens
  • Scientific Program Analyst
  • Division of Genomic Medicine

Policy Analyst

Elena M. Ghanaim, M.A.
  • Policy Analyst
  • Policy and Program Analysis Branch

Last updated: June 22, 2021