The ENCODE Project: ENCyclopedia Of DNA Elements


ENCODE Funding Opportunities announced January 15, 2016

ENCODE FOA Webinar Information

ENCODE Overview

The National Human Genome Research Institute (NHGRI) launched a public research consortium named ENCODE, the Encyclopedia Of DNA Elements, in September 2003, to carry out a project to identify all functional elements in the human genome sequence. The project started with two components - a pilot phase and a technology development phase.

The pilot phase tested and compared existing methods to rigorously analyze a defined portion of the human genome sequence (See: ENCODE Pilot Project). The conclusions from this pilot project were published in June 2007 in Nature and Genome Research [genome.org]. The findings highlighted the project's success in identifying and characterizing functional elements in the human genome. The technology development phase has also been a success, promoting several new technologies to generate high-throughput data on functional elements.

With the success of the initial phases of the ENCODE Project, NHGRI funded new awards in September 2007 to scale the ENCODE Project to a production phase on the entire genome along with additional pilot-scale studies. Like the pilot project, the ENCODE production effort is organized as an open consortium and includes investigators with diverse backgrounds and expertise in the production and analysis of data (See: ENCODE Participants and Projects). This production phase also includes a Data Coordination Center [genome.ucsc.edu] to track, store and display ENCODE data along with a Data Analysis Center to assist in integrated analyses of the data. All data generated by ENCODE participants will be rapidly released into public databases (See: Accessing ENCODE Data) and available through the project's Data Coordination Center.

Meetings and Workshops

ENCODE Publications, Features and Press Releases

Press Releases

ENCODE Consortium Membership

The ENCODE Consortium is composed primarily of scientists who were funded under RFAs released by NHGRI. Other participants have been identified and brought into the Consortium or Analysis Working Group (which leads the integrative analysis of ENCODE data) as appropriate. The Consortium and Analysis Working Group are open to any investigator willing to abide by the criteria for participation established for the ENCODE Project by NHGRI. The ENCODE External Consultants Panel oversees the activities of the Consortium and provides advice and feedback on the Consortium's goals, progress and membership.

Those interested in applying for membership to the ENCODE Consortium or to the ENCODE Analysis Working Group should review the criteria for participation and contact Elise Feingold, Ph.D., or Peter Good, Ph.D. (See: Program Staff).

ENCODE Data Release Policy

NHGRI has designated the ENCODE Project as a community resource project to accelerate access to and use of the data by the entire scientific community. Accordingly, the data release policy is based on the principle of rapid data release to the scientific community.

Accessing ENCODE Data

The data produced by ENCODE Consortium members are deposited in public databases and are available for all to use without restriction. Data linked to the genomic sequence are stored and visualized on the University of California, Santa Cruz browser at www.encodeproject.org. Other, non-sequence-based data, such as those from microarray studies, are available in public databases such as the Gene Expression Omnibus (GEO) [ncbi.nlm.nih.gov] and ArrayExpress [ebi.ac.uk]. The NHGRI Division of Intramural Research will be developing a "portal" that will function as a single point of entry from which users can search and retrieve data from the ENCODE Project. Data users should abide by the ENCODE Data Release Policy when accessing data produced by ENCODE Consortium members.

Informed Consent for ENCODE Samples

As the ENCODE Project has increased its study of primary cells and tissues, it has begun working on human biological samples that have been explicitly consented for genomic research and unrestricted sharing of genomic data, in order to maximize the accessibility and utility of ENCODE data. This means that data can be deposited in freely accessible databases, e.g., GEO and the ENCODE Portal, and shared without registration or prior approval.

The ENCODE Consortium has developed sample informed consent language that explicitly asks for 1) consent to genomic research and 2) consent to unrestricted sharing of genomic data. Below are links to this sample language as well as two examples of IRB-approved consents allowing for release of genomic data to unrestricted, public databases:

These examples provide the research community with information and examples to assist with the development of informed consent processes and consent forms for genomics-related research projects. They are not provided as guidance or as a template promoted by NHGRI, but as a reference to inform investigators and IRBs considering these issues. It is important to tailor consent documents for each individual study.

For general information from NHGRI about the informed consent process in genomics research, including additional sample consent forms, see: www.genome.gov/informedconsent.

ENCODE Tutorials

A variety of tutorials are available on accessing and using ENCODE human and mouse data for studies on basic biology and human disease.

ENCODE Common Cell Types

Common cell types have been identified by the Consortium to aid in the integration and comparison of data produced by ENCODE participants using different technologies and platforms.

ENCODE Project Requests For Application (RFAs)




The pilot and technology development phases of the ENCODE project were initiated simultaneously in 2003 when NHGRI released Requests For Application (RFAs) for each of these phases. The first RFA for the pilot phase, RFA HG-03-003, entitled Determination of all functional elements in human DNA, solicited applications from those interested in participating in a research network to conduct a pilot project to test and compare existing methods for identifying all of the functional elements in a limited (~1%) region of the human genome. The second RFA, RFA HG-03-004, entitled Technologies to find functional elements in DNA, solicited applications to develop new and improved technologies for the efficient, comprehensive, high-throughput identification and verification of all types of sequence-based functional elements, particularly those other than coding sequences, for which adequate methods do not currently exist.

NHGRI re-released the technology development RFA in 2004 and 2006. RFA HG-04-001, issued in 2004, solicited additional applications with an added emphasis on high-risk, high-payoff projects and on technologies that might be applied to model organism genomes. RFAs HG-07-028 and HG-07-029, issued in 2006, had an added emphasis on methods to identify functional elements in repetitive sequences and on methods that can be used to validate the identity of functional elements independently of the primary mode of detection.

As the initial phase of the ENCODE Project was to be completed in September 2007, NHGRI issued RFAs in November 2006 to solicit applications for research projects to continue the ENCODE-based analysis of the human genome at both pilot and whole-genome scales. RFA HG-07-030, entitled Creating the Encyclopedia of DNA Elements (ENCODE) in the Human Genome (U01 and U54), solicited applications for research projects to identify functional elements in the entire human genome sequence (for whole-genome scale projects) or in the 1 percent of the genome targeted during the ENCODE pilot phase (for pilot-scale projects). RFA HG-07-031, entitled A Data Coordination Center for the Encyclopedia of DNA Elements (ENCODE) Project (U41), solicited applications to develop, house, and maintain databases to track, store, and provide access to the different types of data generated as part of the ENCODE Project.

ENCODE Program Staff

Program Directors

Elise Feingold, Ph.D.
E-mail: feingole@exchange.nih.gov

Michael Pazin, Ph.D.
E-mail: pazinm@mail.nih.gov

Daniel Gilchrist, Ph.D.
E-mail: daniel.gilchrist@nih.gov

Program Analysts

Julie Coursen
E-mail: julie.coursen@nih.gov

Hannah Naughton
E-mail: hannah.naughton@nih.gov

National Human Genome Research Institute
National Institutes of Health
5635 Fishers Lane
Suite 4076, MSC 9305
Bethesda, MD 20892-9305

Phone: (301) 496-7531
Fax: (301) 480-2770

