||Welcome and Opening Remarks: Francis S. Collins
|Experience in Comprehensive Annotation
||Michael Ashburner/Suzanna Lewis
||Experimental analysis of computational predictions
||Annotation with high density arrays
||Transcriptome analysis with arrays
||CASP (Critical Assessment of Structure Prediction) Experiment
||Discussion of a proposal to establish a public pilot project on the exhaustive annotation of human DNA sequence
||Mark Guyer, Chair
||A. FUNDAMENTAL QUESTION: Is such a pilot project desirable and workable?
B. A DRAFT PROPOSAL (presented for the purpose of stimulating discussion):
- The pilot project will be performed by an NHGRI-established Research Network, with funding, operations, and common policies to be worked out.
- The pilot project will involve the analysis of ~30 Mb of human genomic sequence (i.e., 1% of the human genome). The aim will be to identify all functional elements within that 30 Mb of DNA.
- Ten 3-Mb target regions will be chosen -- this will ensure that the compositional variation across the human genome is accounted for while, at the same time, involving regions sufficiently large to allow the identification of long-range elements.
- A variety of methods, both computational and experimental, will be utilized. Experimental approaches will include those that provide de novo insight about the presence of functional elements as well as those that aim to validate computational predictions.
- The results of all analyses by any participants in the project will be deposited in a common database and displayed by an appropriate browser. The browser will make the results available to all participants for comparison, to facilitate insights into what combinations of data or methods are most useful, and to enable the evaluation of the various methods by the entire scientific community.
- The database will be a public resource, and all of the data in it will be available for use by anyone according to the following rules:
  - The results of any analyses that use data from the database should be submitted to the database and thereby shared publicly.
  - Prior to publication by the investigators who generated the data, the data in the database should be considered to be unpublished data and treated according to the scientific community's norms for use of another investigator's unpublished data (i.e., any publication of work that uses data from the database should acknowledge the source of the data and permission to cite the unpublished data should be obtained).
  - Upon publication, the data in the database are freely available for use.
- All participants in the Research Network will meet twice a year to discuss progress and evaluate methods.
- Self-defined and self-organized sub-groups within the Research Network will be encouraged to interact more frequently, by electronic means as well as through face-to-face meetings and conference calls.
||Issues to address in order to implement the proposed pilot project
- An initial list of functional elements to identify.
- An initial list of technologies/approaches that will be utilized.
- How will initial participants be recruited? How will additional participants be added?
- Selection of genomic targets.
- Establishment of a database and browser.
- Specifics of the interactions among the participants in the Research Network.