Data Tools and Resources

Researchers at the National Human Genome Research Institute have developed a number of software and analysis tools to help researchers around the world analyze and explore their genomic data. These tools are free and openly accessible to anyone.

Available Resources

ampliconDIVider
ampliconDIVider identifies deletion and insertion variants (DIVs) in DNA amplicons.

bam2mpg
A Bayesian genotype caller for next-generation sequencing data.

BuddySuite
A collection of four independent, yet interrelated, command line programs that facilitate each step in the workflow of sequence discovery, curation, alignment, and phylogenetic reconstruction.

Complementary Pairs Stability Selection for Genome-Wide Association Studies (ComPaSS-GWAS)
An ad hoc alternative to replication that can reduce type I errors in GWA studies when appropriate replication data are not available.

Conserved Domain-based Prediction (CDPred)
A computational algorithm designed to calculate the theoretical effect of substituting an amino acid, relative to the reference sequence, within a protein's functional modules (its domains).

GeIST
A set of files and scripts used to detect and annotate MLV integration sites.

GeneLink
A data management system designed to facilitate genetic studies of complex traits.

Genometric Analysis Simulation Program (G.A.S.P.)
A software tool that can generate samples of family data based on user-specified genetic models.

r2VIM
A new recurrency-based variable selection method in random forests for genome-wide genetic association studies.

ROMPrev
A software suite for quantitative trait and locus-specific heritability estimation and association testing using the revised ROMP method.

Shimmer
A tool for the detection of genetic alterations in tumors from next-generation sequencing data.

SKIPPY
A tool for scoring exonic variants for features associated with exon skipping and ectopic splice site creation.

SOOP
A tool for the design and selection of overgo probes optimized for high-throughput comparative mapping.

SubmiRine
A software package for predicting microRNA target site variants (miR-TSVs) from clinical genomic data sets.

Tiled Regression Analysis
A software framework for selecting a set of genetic predictors that jointly and independently explain trait variation with an additive regression model.

trieFinder
A tool that rapidly maps sequence tags to RefSeq, UniGene, and genomic sequences, providing output amenable to both transcript quantification and the detection of novel transcripts.

Var-MD
An annotation and analysis tool for next-generation sequencing variants in rare diseases and small pedigrees.

VarSifter
VarSifter is a graphical Java program designed to display, sort, filter, and generally sift variation data from massively parallel sequencing experiments.

Last updated: October 11, 2019