Active CEGS Awards
Jef D. Boeke
New York University Langone Health
The Center for Synthetic Regulatory Genomics (SyRGe) is tasked with developing and applying revolutionary technology for making dramatic, coordinated changes to extensive gene loci, which will enable broad investigation of the function of regulatory sequences and foster translational applications to biotechnology, personalized medicine and gene therapy. Specifically, we will (i) design and synthesize more than 1,000 constructs, each larger than 100 kbp, probing regulatory function at a carefully selected set of genes; (ii) develop a pipeline for their single-copy integration at any desired site genome-wide; (iii) survey libraries for effects on gene expression, chromatin features, and nuclear architecture; and (iv) develop global computational models to predict regulatory effects of sequence variation on gene expression. The Center will dramatically supersede present and predicted technologies for manipulation and assessment of regulatory genome function. This Center also incorporates a unique and highly successful outreach program whereby undergraduates from diverse backgrounds play a crucial role in genome assembly, as well as a new Fellows program to expose researchers and students from other fields to transformative new technology and facilitate its promulgation throughout the larger human genetics and genomics communities.
Tens of thousands of human genomes have been sequenced, but the central challenge is their interpretation. A comprehensive set of regulatory events across a genome — the regulome — is needed to make full use of genomic information, but is currently out of reach for most clinical applications and biological systems. The Center will develop technologies that greatly increase the sensitivity, speed, and comprehensiveness of understanding genome regulation. We will develop new technologies to interrogate the transactions between the genome and regulatory factors, such as proteins and noncoding RNAs from single cells, and integrate variations in DNA sequences and chromatin states over time and across individuals. Novel molecular engineering and biosensor strategies are deployed to encapsulate the desired complex DNA transformations into the probe system, such that the probe system can be directly used on very small human clinical samples and capture genome-wide information in one or two steps. These technologies will be applied to clinical samples with genomic aberrations to exercise their robustness, and reveal for the first time epigenomic dynamics of human diseases during progression and treatment. These technologies will be broadly applicable to many biomedical investigations, and the Center will disseminate the technologies via training and diverse means.
George M. Church
Harvard Medical School
The Center for Genomically Engineered Organs (CGEO) will combine cutting edge genomics, genome editing technology, and tissue engineering methods to develop improved models of complex tissues. These tissues will be producible in laboratories from reprogrammed or genetically modified stem or other cells, will contain multiple cell types and vasculatures representative of natural (healthy or diseased) tissues, and will be characterized deeply at a molecular level and for overall tissue architecture. Such model tissues will greatly expedite biomedical progress by providing researchers a way to conduct preliminary tests of theories about normal and disease biology quickly and inexpensively in their laboratories before they have to move on to costly and potentially invasive experiments on animals or humans. To build the capacity to generate such models, CGEO will develop methods for comprehensively analyzing tissues in situ at a molecular level, by acquiring high-throughput RNA expression, protein expression, and epigenomic data from each of the tissue's individual cells that retains information about the locations of these molecules in the cells. CGEO will develop and use these methods to characterize model tissues important to neurobiology that will be built from neurons of different types derived from human induced pluripotent stem cells and grown into cerebral organoids. Vascularizing these organoids, and perfusing them so as to provide them nutrients and eliminate wastes, will enable them to grow into larger and more mature forms than achieved to date, and thus improve their ability to model natural tissues. In situ molecular data obtained from these neurons and organoids will be compared with data from comparable natural tissues to assess and improve their representativeness. 
CGEO is a collaboration of four laboratories in the Boston area with combined expertise in advanced genomic and proteomic technology, genome engineering, stem cell technology, epigenetics, super-resolution microscopy, and tissue engineering. The CGEO team comprises Professors George Church (Principal Investigator) and Chao-Ting Wu (both from Harvard Medical School), Ed Boyden (MIT), and Jennifer Lewis (Wyss Institute at Harvard).
University of California, Berkeley
The ability to understand normal and pathologic functions of the human genome and to translate that knowledge into effective therapies depends critically on determining how encoded genetic information confers phenotype. Recent advances in DNA sequencing and bioinformatics have provided vast quantities of genomic data that, in principle, hold the keys to advances in preventive medicine and therapeutic intervention. However, realizing the promise of personalized medicine will require accurate interrogation and manipulation of DNA sequences in situ at a scale and level of accuracy not currently available. The Center for Genome Editing and Recording (CGER) will address these challenges by creating technologies to detect, alter and record the sequence and output of the genome in individual cells and tissues. Building on the CRISPR-Cas9 genome engineering technology harnessed from bacteria, CGER will couple the RNA-guided DNA cleavage activity of the Cas9 enzyme to strategies for enhancing DNA sequence replacement using homology-directed double-strand break repair. In parallel, CGER will conjugate Cas9 to DNA "base editing" domains to enable accurate introduction or correction of point mutations without double-stranded DNA cleavage. Using cell-based assays, CGER researchers will interrogate specific disease-associated loci in human cells to provide new biological insights and uncover new therapeutic targets. Together, these approaches will enable the creation of any desired sequence alteration at any locus with high specificity and efficiency, with profound implications for both genome science and practical therapeutic intervention. To complement this suite of genome-manipulation technologies, CGER will also develop a high-throughput pipeline for testing the functional gene expression impacts of sequence variants responsible for human disease.
This pipeline will identify and illuminate the relationships between human genome sequence variations, target gene expression and interactions with other genes. Finally, CGER will create new methods for permanently recording cell state changes in DNA so that they can be read out in a single-cell RNA-seq format. This development of molecular cell recorders will focus primarily on an evolving lineage tracer that, by enabling the generation of fate maps at unprecedented resolutions, holds the promise to revolutionize studies of normal development and disease progression.
University of Washington
To date, millions of human genetic variants have been found, many in the coding or regulatory sequence of genes. However, for only a tiny fraction of these variants do we understand how the expression or function of the encoded product is affected. As a consequence, the promise of sequencing human genomes to understand human phenotypes – especially the risk for many diseases with genetic components – has gone largely unfulfilled. What is needed are facile, high-throughput methods for generating libraries of human cells bearing mutant sequence elements and for assessing these libraries to determine each variant's effect on molecular and cellular phenotypes. Thus, the Center for the Multiplexed Assessment of Phenotype, based largely in the University of Washington's Department of Genome Sciences, proposes to develop highly generalizable, reproducible and scalable technologies to generate, and assess the functional impact of, variants in human genes. In the first specific aim, the Center will establish two workhorse methods of mutagenesis to produce variants: saturation editing of genes at their endogenous loci in the human genome, and in vitro generation of variant libraries that are recombined into safe harbor sites. In the second specific aim, the Center will develop approaches to explore the impact of mutations in noncoding regions on versions of genes that have been minimized – pared down to partially remove intronic sequence but still capable of providing essential activity. Further, it will develop mass spectrometry methods to analyze variation in coding sequences for its effect on protein abundance, stability, interactions, turnover and aggregation. In the third specific aim, the Center will assess variant effects on cell morphology, behavior and internal organization by using a novel, microscopy-based phenotyping technology, and on global transcription by developing a massively parallel single-cell mRNA profiling method.
Center-developed technologies will be piloted on a set of human genes with disease relevance, enabling comparisons between each variant's functional effects and the effects of known pathogenic or benign variants. This effort will inform the use in the clinic of the large-scale functional data the Center's technologies will generate. Additionally, variants will be assessed under different conditions, such as in multiple cell lines, in combination with another mutation, or in the presence of a drug. The Center will also train early career experimentalists, clinical geneticists and data scientists to obtain and use large-scale functional data. This training will include internships in Center laboratories for one to three months, and apprenticeships for one to two years. These close interactions will generate medically- and biologically-relevant results and reveal the best paths for translating the vast amounts of Center-generated functional data for clinical use. Through these new technologies and their dissemination to the broader clinical community, the Center will advance the promise of the Human Genome Project by interpreting the vast landscape of human genetic variation.
A phenomics-first resource for interpretation of variants
Oregon State University
Genomics is key to precision medicine; however, despite the ease of sequencing, clinical interpretation is still thwarted because relevant data (disease, phenotype, and variant) is complex, heterogeneous, and disaggregated across sources. Moreover, this evidence is sometimes incomplete, conflicting, and erroneous. Consequently, clinicians face long lists of candidate diseases, genes, and countless variants of unknown significance. This situation will not improve without capturing and harmonizing the underlying phenotypic information; computability of this information is the bedrock for the emerging field of phenomics. From basic science to clinical care, communities need structured ways to represent and exchange phenotypes and disease definitions. Addressing these fundamental phenomics needs makes it possible to computationally assess and reveal links between diseases and variants. We have previously shown how the addition of phenotypic information using the Human Phenotype Ontology (HPO) can improve the diagnostic yield for hard-to-diagnose patients, and HPO is therefore now a global standard for “deep phenotyping”. We have demonstrated the applicability of deep phenotyping in the evaluation of rare diseases which have overlapping mechanistic underpinnings with common/complex diseases as well as evolutionarily conserved mechanisms in model organisms. Having coordinated the community and prototyped the underlying computational platforms, we will now align both phenotype ontologies and clinical terminologies, enabling better comparison and inference of phenotypes for improved diagnostic efficacy. We propose to develop a Phenomics-First Resource (PFR). Specifically, we will:
1. Create a community-driven framework of interoperable phenotype definitions across species (uPheno)
2. Harmonize human disease definitions with the MONDO disease alignment resource
3. Create a community-wide exchange standard for clinical and model-organism phenotypes (Phenopackets)
4. Develop an integrated phenomics platform to provide the research (e.g. BioLink) and clinical (FHIR) communities with programmatic access to phenomics ontologies, data, and algorithms
The dynamic suite of interlinked technologies will together leverage community-developed knowledge in order to make variant interpretation more reliable, better provenanced, and more clinically actionable.
Center for Dynamic RNA Epitranscriptomes
RM1 HG008935
Chuan He
University of Chicago
RNA modifications are ubiquitous in biology and present in all classes of cellular RNAs including eukaryotic messenger and long non-coding RNA. A large fraction of mammalian mRNA/lncRNA modifications are also known to be reversible, highly dynamic, and to occur in a cell type- and cell state-dependent manner. The dynamic RNA epitranscriptomes, those involving N6-methyladenosine (m6A) in particular, are known to regulate many cellular activities including mRNA splicing, export, cytoplasmic localization, stability, translation activity, microRNA processing, and immune tolerance, and to impact cellular processes including proliferation, development, circadian rhythm, and embryonic stem cell differentiation. Taking m6A in mRNA/lncRNA as an example, dedicated writers, erasers, and readers exist in human cells to orchestrate an additional layer of complex post-transcriptional gene expression regulation. Emerging new functions of RNA modifications are expected to follow, with significant implications for many aspects of human health and disease. Despite this promise, current epitranscriptome studies are significantly hampered by the lack of technologies that enable quantitative mapping of any type of mRNA/lncRNA modification at high resolution and high sensitivity. This proposal will develop new methods that target mRNA/lncRNA modifications, such as m6A, N1-methyladenosine (m1A), 5-methylcytosine (m5C), and 2'-O-methylation (Nm), for high-throughput sequencing at single-base resolution and suitable for low-input RNA isolated from just hundreds to thousands of cells. New bioinformatics tools will be developed in order to facilitate data analysis. The general approaches proposed can be broadly applied to sequence RNA modifications in other RNA species including more abundant ribosomal RNA, transfer RNA, snRNA, and snoRNA as well as miRNA and piRNA.
We will apply the newly developed methods to obtain base-resolution maps of RNA modifications, to associate these modifications with human diseases, and to carry out proof-of-principle studies in neurobiology. Our proposed research will establish high-throughput, high-resolution, and high-sensitivity methods for epitranscriptome research in all biological areas.
University of Pennsylvania
A cell is a highly complex system with distributed molecular physiologies in structured sub-cellular compartments whose interplay with the nuclear genome determines the functional characteristics of the cell. A classic example of distributed genomic processes is found in neurons. Learning and memory require modulation of individual synapses through RNA localization, localized translation, and localized metabolites such as those from dendritic mitochondria. Dendrites of neurons integrate distributed synaptic signals into both electrical and nuclear transcriptional responses. Dysfunction of these distributed genomic functions in neurons can result in a broad spectrum of neuropsychiatric diseases such as bipolar and depressive disorders and autism, among others. Understanding complex genomic interactions within a single cell requires new technologies: we need nano-scale ability to make genome-wide measurements at highly localized compartments and to effect highly localized functional genomic manipulations, especially in live tissues. To address this need, we propose to establish a Center for Sub-Cellular Genomics using neurons as model systems. The center will develop new optical and nanotechnology approaches to isolate sub-cellular scale components for genomic, metabolomic, and lipidomic analyses. The center will also develop new mass spectrometry methods, molecular biology methods, and informatics models to create a platform technology for sub-cellular genomics.
The Duke FUNCTION Center: Pioneering the comprehensive identification of combinatorial noncoding causes of disease
Noncoding genetic variation that alters gene regulatory element activity has major impacts on health, disease, and evolution. Because measuring regulatory element activity has long been a major challenge, the mechanisms underlying thousands of genetic associations with disease remain unknown. Recent advances in high-throughput technologies have disruptively advanced the ability to measure the activity of individual regulatory elements, and the first population- and genome-scale uses of those methods are now underway. However, regulatory elements do not act alone. They interact with promoters, other regulatory elements, and the surrounding chromatin, all in ways that are complex and difficult to predict. Though there are now a plethora of technologies to measure the activity of individual regulatory elements, the ability to recapitulate the effects of combinations of regulatory elements is woefully inadequate and severely hinders efforts to establish the gene regulatory contributions to traits and diseases. The goal of the Duke FUNCTION Center of Excellence in Genomic Science is to make the study of the combinatorial activity of regulatory elements routine. Aim 1 is to develop a suite of new technologies to measure the combinatorial effects of regulatory elements in their endogenous genomic contexts. Those technologies will leverage very recent discoveries of CRISPR enzymes other than Cas9 that greatly expand the ability to manipulate the human genome. Aim 2 is to develop the matched computational, statistical, and evolutionary models needed to interpret and predict the measured effects of combinations of regulatory variants on human traits and diseases. Aim 3 is to demonstrate the broad applicability of the technologies developed through case studies of human diseases with prevalence ranging from common to ultra rare. Example case studies will include studies of schizophrenia, rare recessive disorders, and undiagnosed genetic disorders. 
We will also use a nationwide request for applications to identify Pilot Projects that will expand applications to other disease areas. Aim 4 is to create an electronic platform for distributing results from functional studies of the noncoding genome to the broad research community. The platform will integrate our results with those from studies in other labs and consortia, such as ENCODE, and will enable researchers with diverse expertise to benefit from the Center. Finally, our Education and Outreach Aim is to expand genomics capacity locally and nationally, with a particular emphasis on increasing use of our new technologies for translational research. The expected outcome of this project will be a paradigm shift in human genetics and genomics in which it will become possible to finally understand the full regulatory complexity that controls the expression of human genes. We anticipate that this ability will be particularly powerful for translating genetic associations into disease mechanisms, thus creating a windfall of new knowledge about which genes contribute most to disease, and how to manipulate those genes for therapeutic benefit. Long term, we envision this work being critical to realizing the full potential of whole genome sequencing to detect causes of disease.
1 RM1 HG011014-01
New York Genome Center
While rapid advances in single-cell RNA-sequencing are yielding comprehensive taxonomies of cell states in the human body, understanding the complex molecular and environmental factors that regulate cell behavior remains a central challenge. New methods for simultaneous measurement of multiple molecular modalities, spatial context, and lineage relationships are needed to address this goal, but are currently outside the scope of present technologies which largely focus on a single data type. We propose to create a Center for Integrated Cellular Analysis, with a mission to develop a comprehensive suite of technologies and analytical methods to measure and integrate the molecular and environmental determinants of cellular identity. To achieve these goals, we propose the following series of synergistic Aims that will be developed in parallel: 1) Develop massively parallel assays to simultaneously profile multiple molecular components across millions of cells; 2) Identify the spatial and environmental determinants of cellular state in complex interacting populations; 3) Develop scalable platforms to profile inherited molecular components, and determine the role of cell lineage in establishing molecular and phenotypic differences across cells; and 4) Develop methods to harmonize single cell profiles across distinct modalities, enabling the inference of cellular identity. Our Center will address critical challenges in data integration, and produce software and protocols that will be applicable to diverse biological systems. We will share these resources broadly with the community, alongside a broader educational focus to encourage New York City students from under-represented backgrounds to pursue academic training in Genomics and Systems Biology.
Center for Genome Imaging
Harvard Medical School
Three-dimensional (3D) genome organization is a major contributor to genome function, and yet, we are only at the very dawn of discovering the structural signatures that underlie that organization. Thus, the goal of the proposed studies is to develop and apply tools that will enable sequence-specific imaging of human genomes, in their entirety, with high genomic resolution. In particular, the proposed work will innovate methods for fixed and live cell imaging using diffraction-limited light microscopy and super-resolution microscopy as well as develop new tools for image analysis and genome modeling. To this end, it will involve the continued collaboration of four laboratories, whose collective breadth of expertise covers the fields of classical and molecular genetics, chromosome dynamics, imaging, Hi-C analysis, convolutional neural networks, and polymer physics-based and restraint-based modeling. An equally important objective of the proposed studies is to ensure a generation of researchers whose personal breadth of expertise will come to match that of the entire current team.
Health relatedness: Will a solid grasp of 3D genome organization have implications for understanding human development? Will it contribute to the protection of human health? Will it contribute to strategies for early diagnostics and perhaps even the development of new therapies? The answer to all these questions is almost certainly a resounding Yes, as knowledge of 3D genome organization will enhance our capacity to address both fundamental biological processes as well as disease.
Innovation: An abundance of studies argue that genomes function as integrated units and, yet, no extant technologies enable sequence-specific imaging of entire genomes at high genomic resolution. Thus, the capacity of researchers to fathom the interplay between 3D genome organization and genome function has been limited to disjointed snapshots of localized events. Accordingly, the first three aims will develop the next tier of tools to put entire genomes within reach. They will advance a new method, OligoFISSEQ, and then integrate it with OligoSTORM and OligoDNA-PAINT to finally achieve high-throughput imaging at both conventional and super-resolution. They will also tackle two genomic features that have been prohibitively difficult to capture – the presence of homologs in diploid cells and highly repeated sequences – as well as innovate strategies for high-volume data storage, image processing and analysis, and modeling. Finally, a fourth aim will implement methods for disseminating our tools.
- Scaling technologies toward whole genome imaging
- Filling in gaps to visualize chromosomes end-to-end – tackling homologs and repeats
- Probe design, image analysis, modeling, and integration of epigenetic data
- Training, resources, and opportunities for engaging colleagues in whole genome imaging
Previous CEGS Awards
Below is a list of previous Centers of Excellence in Genomic Science (CEGS) grant awards. Grant numbers link to the NIH RePORT system, where abstracts, other information about the awards and resulting publications are included.
While active CEGS have their own institutional websites, some are not maintained once the grant ends. If a website still exists, it will be linked from the project title.
Center for Cell Circuits
The Broad Institute, Cambridge, Massachusetts
Center for Genomically Engineered Organs
George M. Church
Harvard Medical School
Center for Photogenomics
John A. Stamatoyannopoulos
The Altius Institute
Neuropsychiatric Genome-Scale and RDOC Individualized Domains (N-GRID)
Harvard Medical School
Genomic Analysis of Network Perturbations in Human Disease
Dana-Farber Cancer Institute, Boston
Center for the Study of Natural Genetic Variation
Olson, Maynard V.
University of Washington
Analysis of Human Genome Using Integrated Technologies
Snyder, Michael P.
Yale University / Stanford University
CEGS: Microscale Life Sciences Center
Meldrum, Deirdre R.
University of Washington / Arizona State University-Tempe Campus
Center for Genomic Experimentation and Computation
Brent, Roger
VTT/MSI Molecular Sciences Institute
Genomic Basis of Vertebrate Diversity
Talbot, William S. / Kingsley, David M.
Implications of Haplotype Structure in the Human Genome
Waterman, Michael S. / Tavaré, Simon
University of Southern California
Genomic Approaches to Neuronal Diversity and Plasticity
Columbia University Health Sciences
Molecular and Genomic Imaging Center
Church, George M.
Harvard Medical School
Center for the Epigenetics of Common Human Disease
Feinberg, Andrew P.
Johns Hopkins University
Center for in Toto Genomic Analysis of Vertebrate Development
Bronner-Fraser, Marianne / Fraser, Scott E.
California Institute of Technology
Wisconsin Center of Excellence in Genomics Science
Medical College of Wisconsin / Texas Biomedical Research Institute
Causal Transcriptional Consequences of Human Genetic Variation
Church, George M.
Harvard Medical School
An Interdisciplinary Program for Systems Genomics of Complex Behaviors
Pardo-Manuel de Villena, Fernando
University of North Carolina, Chapel Hill
Last updated: July 26, 2021