
National Human Genome Research Institute Workshop
on Resources for Detecting Genetic Variations

December 8-9, 1997


The National Human Genome Research Institute (NHGRI), in collaboration with the Centers for Disease Control and Prevention (CDC) and others, is developing a publicly available resource of DNA samples and cell lines that can be used to discover DNA sequence polymorphisms in the U.S. population.


Genetic factors appear to contribute to virtually every human disease. Most common diseases are influenced by multiple genes and environmental factors. A dense map of DNA sequence variations should allow the identification of disease alleles even for these complex diseases. Information about DNA sequence variation will have a wide range of applications in the analysis of disease and in the development of diagnostic, therapeutic and preventive strategies.

The purpose of this resource is to facilitate the discovery of large numbers of sequence variations in human DNA. This resource is NOT intended to contain sufficient information to allow the study of how variation relates to disease or other phenotypes. Investigators can subsequently use the variation data that result from use of this resource to study how the variation relates to disease and health, in projects specifically aimed at particular diseases or traits.

Much more information is gained when investigators study a common set of samples rather than different sets. The additional information will allow studies of the associations among the sequence variants as well as checks of the quality of the information provided by different methods.


The resource will comprise cell lines and DNA from 450 unrelated individuals, female and male. It is designed to reflect the range of diversity in the human population. In addition to the complete set, there will be predefined subsets of 8, 24, and 90 samples, each encompassing the same range of diversity as the complete set. The subsets are nested: each larger subset contains the smaller ones. The samples will come from the CDC NHANES III study, other existing collections and ongoing collection studies.
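The nesting of the predefined subsets can be illustrated with a minimal sketch. It is not the procedure used to build the resource, which is not described here. It assumes each sample carries a single stratum label; the sample IDs, the stratum names (`group_0` to `group_3`) and the function `nested_subsets` are hypothetical. Samples are shuffled within each stratum and interleaved round-robin, so that every prefix of the ordering is spread across strata and each smaller subset is contained in the larger ones.

```python
import random
from collections import defaultdict
from itertools import zip_longest


def nested_subsets(samples, sizes=(8, 24, 90), seed=0):
    """Return {size: [sample_id, ...]} where each smaller subset is a
    prefix of every larger one.

    samples: list of (sample_id, stratum) pairs; both values are
    hypothetical placeholders, not the resource's real labels.
    """
    rng = random.Random(seed)

    # Group sample IDs by stratum.
    by_stratum = defaultdict(list)
    for sample_id, stratum in samples:
        by_stratum[stratum].append(sample_id)

    # Shuffle within each stratum (sorted keys keep the result reproducible).
    groups = []
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        rng.shuffle(ids)
        groups.append(ids)

    # Round-robin interleave: take one sample from each stratum in turn.
    ordering = [
        sample_id
        for batch in zip_longest(*groups)
        for sample_id in batch
        if sample_id is not None
    ]

    # Every subset is a prefix of the same ordering, so the subsets nest.
    return {n: ordering[:n] for n in sizes}


# Hypothetical input: 450 samples spread over four made-up strata.
samples = [(f"S{i:03d}", f"group_{i % 4}") for i in range(450)]
subsets = nested_subsets(samples)
```

Because all subsets are prefixes of one ordering, results obtained on the 8-sample subset remain directly comparable with results obtained later on the 24- or 90-sample subsets.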


To maximize the chances of discovering common DNA sequence polymorphisms in the human population, the individuals sampled are U.S. residents who have ancestors from major geographic regions of the world: Europe, Africa, the Americas and Asia. Many Americans have ancestors from more than one region of the world, and such individuals are included in the resource. Each group has a proportion of its ancestry from the various geographic regions of the world. More detailed information about the composition of the resource will be made available when the resource has been completed.

A random sample of U.S. residents would include genomes of mostly European origin. Although each population contains a large part of the worldwide genetic variation, none contains all of it. To increase the opportunity to discover variation, this resource has been designed to sample individuals with ancestry other than European at a higher rate than their frequency in the U.S. population. No attempt has been made, however, to be exhaustive or precisely balanced. The intent of the sampling strategy is to improve the chances of discovering genetic variation, NOT to draw conclusions about relationships among populations.


Information on geographic origin and sex will be collected for each individual sampled to ensure a diverse collection. Once the collection has been made, all identifying and phenotypic information will be removed from the individual samples, so that links to individual donors will be irreversibly broken. A summary of the distribution of geographic origin and sex will be made available for the complete collection and for each predefined subset as a whole, but no identifiers will be associated with individual samples.


All samples will come from individuals who have provided informed consent to be part of this resource. More individuals will be asked to participate than will be included in the resource; no one will know which individuals are actually in the resource, not even the individuals sampled. The informed consent material explains that the information collected using this resource will be used for studies of genetic variation.


The material in the resource will be available to any investigator, provided that the proposed use of the material has been reviewed by an IRB and approved or designated as exempt.


The repository should be ready to distribute material by October 1998.


The samples will be deposited with the Coriell Institute for Medical Research [coriell.org] in Camden, New Jersey, as part of the National Institute of General Medical Sciences (NIGMS) Human Cell Repository [ccr.coriell.org].


A central database that can receive information on all variants found for each sample is expected to be available by October 1998. Since the samples in the resource are completely anonymous and without identifying information, each sample will be referenced in the database only by a unique sample number. Collecting the data in one place will provide the biomedical research community with ready access to the polymorphisms that are discovered using the resource and will facilitate analysis of the data.

NHGRI Workshop On Resources for Detecting Genetic Variations

December 8-9, 1997


Aravinda Chakravarti
Case Western Reserve University
Cleveland, OH 44106

Linda Burhansstipanov
Director of AMC
Cancer Research Center
Denver, CO 80214

Kenneth Buetow
Fox Chase Cancer Center
Philadelphia, PA 19111-2412

Georgia Dunston
Department of Microbiology
College of Medicine
Howard University
Washington, DC 20059

Jonathan Friedlaender
Department of Anthropology
Temple University
Philadelphia, PA 19122

Bronya Keats
Dept Biometry and Genetics
Louisiana State University Medical Center
New Orleans, LA 70112-1328

Charles Langley
University of California, Davis
Center for Population Biology and
Section of Evolution and Ecology
Davis, CA 95616

D. Andrew Merriwether
University of Michigan
Ann Arbor, MI 48109

John Moore
Department of Anthropology
University of Florida
Gainesville, FL 32611

Robert Nussbaum
Chief, Laboratory Disease Research
National Human Genome Research Institute
National Institutes of Health
Bethesda, MD 20892

Madison Powers
Kennedy Institute of Ethics
Georgetown University
Washington, DC 20057

Edward J. Sondik
Director, National Center for Health Statistics
Centers for Disease Control and Prevention
Hyattsville, MD 20782

Karen Steinberg
Chief, Molecular Biology Branch
National Center for Environmental Health
Centers for Disease Control and Prevention
Chamblee, GA 30341

Diane Wagener
Acting Director, Division of Health Promotion Statistics
National Center for Health Statistics
Centers for Disease Control and Prevention
Hyattsville, MD 20782

LeRoy Walters
Director, Kennedy Institute of Ethics
Georgetown University
Washington, DC 20057

Bruce Weir
Department of Statistics
North Carolina State University
Raleigh, NC 27695-8203

Kenneth Weiss
Department of Anthropology/Biology
Pennsylvania State University
University Park, PA 16802-3404

Last updated: May 27, 2011