These sequencing targets were chosen to improve the ability to identify conserved elements in mammalian genomes, especially the human genome. The list of organisms for this initiative was selected to optimize the total evolutionary branch length, in order to best identify functional sequences in the human genome.
The rationale for the species chosen so far is explained in the Summaries of Working Group Proposals.
To pursue this initiative, the NHGRI large-scale sequencing centers will initially sequence the genomes of 24 species to low (~two-fold) sequence redundancy. Researchers have shown that low-redundancy sequence from a large number of genomes may be more effective for finding conserved genomic regions than high-quality sequence from a smaller number of genomes. As the data are produced, they will be evaluated for their ability to aid in the detection of conserved sequences. In general, the greater the total branch length represented in the genome sequences, the smaller the conserved regions that can be detected and the greater the accuracy of detection. The current list of 24 mammals is expected to resolve conserved regions of ~6 bp with a false-positive rate of 10^-4. Based on the results of ongoing analyses, additional species may be chosen, or additional data may be added for species already sequenced to low redundancy.
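The relationship between branch length and detectable element size can be illustrated with a deliberately simplified calculation. The sketch below is not the working group's method. It assumes a toy neutral model in which each unconstrained site remains unchanged across the whole tree with probability exp(-T), where T is the total branch length in substitutions per site, and in which sites evolve independently. Under that assumption, a run of N identical sites arises by chance with probability exp(-N*T). The function names are hypothetical and chosen only for this illustration.

```python
import math


def min_branch_length(element_bp, false_positive_rate):
    """Total branch length T (substitutions per site) needed so that a
    perfectly conserved run of `element_bp` sites occurs by chance with
    probability at most `false_positive_rate`.

    ASSUMPTION: toy neutral model, P(site unchanged) = exp(-T),
    sites independent. Not taken from the source document.
    """
    if element_bp <= 0 or not 0 < false_positive_rate < 1:
        raise ValueError("element_bp must be > 0 and 0 < rate < 1")
    # Solve exp(-N * T) = rate for T.
    return -math.log(false_positive_rate) / element_bp


def min_element_size(branch_length, false_positive_rate):
    """Smallest whole number of conserved sites detectable at the given
    false-positive rate for total branch length `branch_length`,
    under the same toy model as above."""
    if branch_length <= 0 or not 0 < false_positive_rate < 1:
        raise ValueError("branch_length must be > 0 and 0 < rate < 1")
    # Solve exp(-N * T) <= rate for N and round up to whole sites.
    return math.ceil(-math.log(false_positive_rate) / branch_length)


if __name__ == "__main__":
    # Figures quoted in the text: ~6 bp elements, false-positive rate 10^-4.
    t = min_branch_length(6, 1e-4)
    print(f"Branch length needed under the toy model: {t:.2f} subs/site")
    for total in (0.5, 1.0, 2.0):
        n = min_element_size(total, 1e-4)
        print(f"T = {total}: smallest detectable element is {n} bp")
```

Under these assumptions, the figures quoted above (~6 bp at a false-positive rate of 10^-4) correspond to roughly 1.5 substitutions per site of total branch length, and doubling the branch length roughly halves the smallest detectable element. Real analyses must also account for alignment error, low-coverage gaps, and rate variation, none of which this sketch models.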
The first set of low coverage mammals was proposed in the original Annotating the Human Genome Working Group proposal.
Two subsequent proposals expanded the list of mammals approved for sequencing.
Later proposals to increase the sequence coverage of mammals in the initially proposed sets were approved in 2006 and 2008.
For an updated list of the genomes that are actively being sequenced or that have been completed recently, please see Approved Sequencing Targets.
Adam Felsenfeld, Ph.D.
Jane Peterson, Ph.D.
Associate Director, Division of Extramural Research
Last Reviewed: November 15, 2010