National Institutes of Health U.S. Department of Health and Human Services
ENCODE Project Common Cell Types
The Encyclopedia of DNA Elements (ENCODE) Project seeks to identify functional elements in the human genome. To aid in the integration and comparison of data produced using different technologies and platforms, the ENCODE Consortium has designated cell types that will be used by all investigators. These common cell types include both cell lines and primary cell types, and plans are being made to explore the use of primary tissues and embryonic stem (ES) cells.
Cell types were selected largely for practical reasons, including their wide availability, the ability to grow them easily, and their capacity to produce sufficient numbers of cells for use in all technologies being used by ENCODE investigators. Secondary considerations were the diversity in tissue source of the cells, germ layer lineage representation, the availability of existing data generated using the cell type, and coordination with other ongoing projects. Effort was also made to select at least some cell types that have a relatively normal karyotype.
The cell types and rationales for their selection are described below:
GM12878 is a lymphoblastoid cell line produced by Epstein-Barr virus (EBV) transformation of blood cells from a female donor with northern and western European ancestry. It was one of the original HapMap cell lines and has been selected by the International HapMap Project for deep sequencing using the Solexa/Illumina platform. This cell line has a relatively normal karyotype and grows well. Choice of this cell line offers potential synergy with the International HapMap Project and genetic variation studies. It represents the mesoderm cell lineage. Cells will be obtained from the Coriell Institute for Medical Research [coriell.org] (Catalog ID GM12878).
K562 is an immortalized cell line produced from a female patient with chronic myelogenous leukemia (CML). It is a widely used model for cell biology, biochemistry, and erythropoiesis. It grows well, is transfectable, and represents the mesoderm lineage. Cells will be obtained from the American Type Culture Collection (ATCC) [atcc.org] (ATCC Number CCL-243).
HeLa-S3 is an immortalized cell line that was derived from a cervical cancer patient. It grows extremely well in suspension and is transfectable. It represents the ectoderm lineage. Many data sets were produced using this cell line during the pilot phase of the ENCODE Project. In addition, these cells have been widely used in biochemical and molecular genetic studies of gene function and regulation. Cells will be obtained from the American Type Culture Collection (ATCC) [atcc.org] (ATCC Number CCL-2.2).
HepG2 is a cell line derived from a male patient with liver carcinoma. It is a model system for metabolism disorders, and a large body of data on transcriptional regulation has been generated using this cell line. It grows well, is transfectable, and represents the endoderm lineage. Cells will be obtained from the American Type Culture Collection (ATCC) [atcc.org] (ATCC Number HB-8065).
HUVEC (human umbilical vein endothelial cells) have a normal karyotype and are readily expandable to 10^8 to 10^9 cells. They represent the mesoderm lineage. Cells will be obtained from Lonza Biosciences [lonza.com].
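For readers who track these cell types in analysis scripts, the descriptions above can be captured in a small lookup table. The following is only an illustrative sketch: the `CellType` record, the `COMMON_CELL_TYPES` list, and the `group_by_lineage` helper are hypothetical names introduced here, not part of any ENCODE software. The lineage and catalog values are copied from the descriptions above, and no catalog number is recorded for HUVEC because none is given.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class CellType(NamedTuple):
    """Hypothetical record for one ENCODE common cell type."""
    name: str
    lineage: str               # germ layer lineage as stated in the text
    supplier: str
    catalog_id: Optional[str]  # None where the text gives no catalog number


# Values copied from the cell type descriptions above.
COMMON_CELL_TYPES = [
    CellType("GM12878", "mesoderm", "Coriell Institute for Medical Research", "GM12878"),
    CellType("K562", "mesoderm", "ATCC", "CCL-243"),
    CellType("HeLa-S3", "ectoderm", "ATCC", "CCL-2.2"),
    CellType("HepG2", "endoderm", "ATCC", "HB-8065"),
    CellType("HUVEC", "mesoderm", "Lonza Biosciences", None),
]


def group_by_lineage(cell_types: List[CellType]) -> Dict[str, List[str]]:
    """Group cell type names by germ layer lineage, preserving input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for cell_type in cell_types:
        groups[cell_type.lineage].append(cell_type.name)
    return dict(groups)


if __name__ == "__main__":
    for lineage, names in group_by_lineage(COMMON_CELL_TYPES).items():
        print(f"{lineage}: {', '.join(names)}")
```

Grouping by lineage makes the germ layer coverage easy to check: all three germ layers are represented, with mesoderm accounting for three of the five cell types.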
Information on Common Resources used by the ENCODE pilot project, including the pilot project target sequences, BAC clones for ENCODE targets, cell lines, and antibodies to DNA-binding proteins, can be found at www.genome.gov/12513455.