Agenda for Long-Term Maintenance of Genome Sequence Assemblies: Human and Other Genomes

National Human Genome Research Institute

National Institutes of Health
U.S. Department of Health and Human Services


Hyatt at Dulles International Airport
Herndon, Va.
November 8-9, 2004

Agenda

Monday, November 8, 2004
6:30-7:30 p.m. Registration
7:30-8:00 p.m. Introductory Remarks
Francis Collins, NHGRI
Bill Gelbart, Harvard University
8:00-9:30 p.m. Maintaining genomes as state-of-the-art assemblies (Jane Rogers)
5- to 10-minute presentations on "the problem" from the perspective of:
The sequencing centers: Jane Rogers and Eric Lander
The sequence repositories: Jim Ostell
The model organism databases: Mike Cherry
"Power users": central annotators/browsers: Ewan Birney
The downstream user community: Tom Gingeras
The funding agencies: Adam Felsenfeld
Discussion: What other challenges do we need to anticipate?

Tuesday, November 9, 2004
8:00-8:30 a.m. Continental Breakfast
8:30-10:00 a.m. Panel discussion: What to do about human and other finished genomes (Ewan Birney)

The long-term maintenance challenges for finished genomes are likely to be similar; they have similar (or at least proportionate) levels of support for sequencing and annotation, and they are or will be of high enough quality that they will be relatively stable over time. This session will emphasize the human genome but also bring in examples from other finished genomes.

Panelists: G. Schuler, R. Gibbs, R. Wilson, B. Gelbart, R. Young, J. Weissenbach

What is a "reference" finished genome assembly?
How is it likely to change over time?
Does it serve the needs of the community?
How does the community provide feedback on the status of the human and other finished genomes?
How is community feedback evaluated or otherwise dealt with?
Who has the right/responsibility to revise the sequence?
How will the finished reference sequence enter the "public domain"?
What is a good "stopping point" for the sequencing center on a particular finished genome?
Assume a sequencing center completely abandons all future work on a finished genome (no further assembly, annotation, etc.). Would this be a problem? If so, why?
Is there an existing working model that will work for long-term maintenance for one of the finished model organism genomes?

10:00-10:15 a.m. Break
10:15-11:45 a.m. Panel discussion: What to do about genomes that will remain draft assemblies (Richard Myers)

Draft assemblies pose a special set of challenges for long-term maintenance relating to quality and stability. They vary in quality. They may improve as future data are added, and they will improve with the advent of better assemblers, so there may be multiple versions that are difficult to distinguish. How has the problem been addressed for draft genomes?

Panelists: H. Jacob, G. Myers, M. Brent, D. Church, D. Rokhsar, R. Waterston

How do we ensure the best version of a draft genome is presented on a continuing basis?
How should multiple assemblies of the same data be presented to the community?
What are the effects of the conflicting needs for stability and currency?
How will a draft sequence enter the "public domain"?
What is a good "stopping point" for the sequencing center?
How would improvements to the genomes be incorporated after that?
What is the role of the model organism databases in ensuring long-term maintenance of draft genomes?
Do they have the resources to evaluate assemblies?
Should anything be done differently in the special case where a draft genome of an organism is closely related to a finished one?
What should be the responsibilities of the specific organism community in ensuring long-term maintenance of draft genomes?
To what extent has each model organism group had to deal with these questions?
Is there already a successful model that can be used by others?

12 noon-1:00 p.m. Annotation needs for finished and draft genomes (Tim Hubbard)

This workshop does not aim to comprehensively address long-term challenges in annotation. However, primary annotation is so intimately connected to genome sequence that the effects on annotation need to be given additional consideration. This session will be a series of talks from those involved in annotation about what sequence information they need in order to do annotation, what pitfalls to avoid in organizing a long-term maintenance effort, and what organization or requirements they have to make their tasks possible. A range of annotation activities will be highlighted, from initial automated annotation to curated annotation to higher level annotation.

10-minute presentations:

  1. Robert Kuhn
  2. Kim Pruitt
  3. Richard Durbin
  4. Monte Westerfield

1:00-2:30 p.m. Lunch and Executive Session
2:00-4:00 p.m. Presentation of Proposal and Discussion (Bill Gelbart and Jane Rogers)

How should the responsibilities for long-term curation, especially maintenance, be fulfilled?
What are the right mechanisms? What are the different solutions for finished and draft genomes?
What do the funding agencies need to do?
What do the sequencing centers need to do?
What do the databases and annotators need to do?
Does the proposal address all the issues raised?

4:00 p.m. Adjourn

Last Updated: September 21, 2012