Advanced Sequencing Technology Awards 2010

The National Human Genome Research Institute (NHGRI) has continued its coordinated effort to support the development of technologies to dramatically reduce the cost of DNA sequencing, a move aimed at broadening the applications of genomic information in medical research and health care. The 2010 awards were announced on September 1, 2010 (see: Press Release).

Award Abstracts

Microfluidic DNA Sequencing
Abate, Adam
GnuBIO, Inc., New Haven, Conn.
R43 HG005144
$240,000 (1 year)

Most next-generation DNA sequencing methods have focused on either 1) template amplification followed by massively parallel sequencing-by-synthesis, or 2) single-molecule detection. The first method is now commercially available but consumes relatively large volumes of expensive reagents. The latter method, although not yet commercially available, will have a disadvantage in signal-to-noise and therefore require more sensitive and expensive instrumentation. To avoid these disadvantages, we will use droplet-based microfluidics to sequence DNA. By using microfluidics we limit the amount of reagent required to sequence DNA to less than several milliliters, while still retaining the ability to amplify the template, which enables us to use relatively inexpensive and robust detection. Hybridization of short probes will be detected in microfluidic droplets by a shift in fluorescence polarization that distinguishes between bound and free oligo. This removes the requirement for a separation phase to detect hybridization. The method is simple and does not require enzymes. In Phase I we will describe a simple platform and resequencing method that will be scaled in a future Phase II project to enable human genome sequencing for under $1000.
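The polarization readout mentioned in this abstract can be illustrated with the standard definition of fluorescence polarization, P = (I_parallel - I_perpendicular) / (I_parallel + I_perpendicular). The sketch below is not taken from the award; the intensity values and the decision threshold are hypothetical and chosen only to show how a bound probe (slow tumbling, higher P) might be separated from a free one (fast tumbling, lower P).

```python
def polarization(i_parallel: float, i_perpendicular: float) -> float:
    """Fluorescence polarization from the two emission intensities."""
    total = i_parallel + i_perpendicular
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - i_perpendicular) / total


# Hypothetical cutoff; a real assay would calibrate this per probe and dye.
BOUND_THRESHOLD = 0.15


def is_bound(i_parallel: float, i_perpendicular: float) -> bool:
    """Classify a droplet as containing hybridized probe if P exceeds the cutoff."""
    return polarization(i_parallel, i_perpendicular) > BOUND_THRESHOLD


if __name__ == "__main__":
    # Illustrative intensities only (arbitrary units).
    for label, i_par, i_perp in [("free", 110.0, 90.0), ("bound", 140.0, 60.0)]:
        p = polarization(i_par, i_perp)
        print(f"{label}: P = {p:.2f}, bound = {is_bound(i_par, i_perp)}")
```

Because the classification depends only on the ratio of the two intensities, no separation step is needed before the measurement, which matches the point made in the abstract.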

Polony Sequencing and the $1,000 Genome
Edwards, Jeremy S.
University of New Mexico Health Sciences Center, Albuquerque
R01 HG005852
$2,758,000 (3 years)

Collaborators: Susan Atlas, Dimiter Petsev, University of New Mexico

We propose to further develop and utilize ultra-high throughput polony genome sequencing, with the primary goal of generating raw data to re-sequence the human genome in one week (including library prep and sequencing) for less than $1,000. Currently, the technology is well advanced, but further progress is needed to meet our goals. As the critical quantitative milestone of the project, we will report the sequence for a human genome with "a target sequence quality equivalent to, or better than that of the mouse assembly published in December 2002 (Nature 420:520, 2002)". The project has been divided into three specific aims, which are to (1) increase the polony sequencing read length using a cyclic ligation strategy that involves enzymatic cleavage, (2) increase read density by using different clonal amplification strategies, and (3) extend our software capabilities to allow SNP calls from our raw sequence data. Progress towards our goals is at an advanced stage, and we are able to routinely sequence bacterial genomes and are on the verge of sequencing an entire human genome to the required quality level. Overall, we feel there are substantial rewards to be gained by completing the aims described herein. We have extensive experience with the proposed technologies and have a clear path toward the $1,000 genome.

Millikan Sequencing by Label-Free Detection of Nucleotide Incorporation
Farinas, Javier A.
Caerus Molecular Diagnostics, Inc., Los Altos, Calif.
R43 HG005865
$500,000 (2 years)

The feasibility of a label-free technology, Millikan Sequencing, will be evaluated for de novo sequencing of mammalian genomes for under $1,000. This novel sequencing-by-synthesis approach measures the increased charge as nucleotides are added to DNA templates attached to a tethered bead. Opposing electrical, hydrodynamic and entropic forces will be used to measure the bead displacement, which is a function of the length of DNA attached to the bead. Simultaneous detection of an array of millions of beads undergoing chain elongation will allow high-throughput sequencing. Model calculations and preliminary results indicate that this method should enable accurate, long read length and label-free DNA sequencing. The lack of labels leads to negligible reagent costs while the relatively simple optics leads to a low-cost instrument. Long read lengths will result in low genome assembly cost. The much lower per-bead copy number required compared to the 454 system should enable amplification options other than emulsion PCR, such as bridge PCR, making initial sample preparation easier and cheaper. Ultimately, the method could be used on single molecules thereby further reducing sample preparation costs. The aims of the proposed two-year exploratory project are: (1) to demonstrate the ability to sequence DNA using a single tethered bead, and (2) to develop a scheme that would allow simultaneous detection of large bead arrays for high throughput analysis.

Single-Molecule DNA Sequencing with Engineered Nanopores
Ghadiri, M. Reza
Scripps Research Institute, La Jolla, Calif.
R01 HG003709
$5,150,000 (4 years)

Collaborators: Hagan Bayley, University of Oxford, Amit Meller, Boston University

In nanopore strand sequencing, a single strand of DNA moves through a narrow pore and the bases are identified as they pass a reading head. Here, we focus on the remaining tasks required to put into practice strand sequencing with the α-hemolysin (αHL) protein nanopore. Nanopore sequencing is a rapid real-time technology; it does not require the time-consuming cyclic addition of reagents. After implementing a chip with 10^6 pores, we expect nanopore sequencing to achieve a 15-minute genome by 2014 with a very short sample preparation time. In addition, nanopore sequencing will be able to identify modified bases and to sequence RNA directly.

Over the past four years, we have made significant progress; we have shown that all four nucleobases can be identified within intact DNA strands and demonstrated real-time single-nucleotide strand translocation driven by DNA polymerase. We are now in a position to integrate these findings and, with a nanopore array, achieve ultrarapid sequencing. In the next funding period, we will: 1. Refine base recognition by using αHL nanopores, engineered by conventional mutagenesis, unnatural amino acid mutagenesis and targeted chemical modification, to produce DNA reading heads fit for real-time sequencing. 2. Achieve control of strand translocation for nonenzymatic DNA sequencing. The speed of DNA movement will be slowed by the use of rotaxanes, made from small molecules or engineered protein rings, so that bases can be detected by available recording techniques. 3. In a parallel effort, control DNA movement enzymatically by using DNA polymerase. The polymerase will also be employed in two novel sequencing modes, based on nanopore detection of conformational changes associated with nucleobase incorporation. 4. Develop chips containing up to 10^6 αHL nanopores. First, the prototype of an optically detected 10^6-pore chip will be developed. Second, αHL pores will be placed in arrays of apertures that have been bored into a silicon nitride film with an electron beam, thereby avoiding the use of lipid bilayers altogether. In year 4, these crucial aspects of nanopore sequencing will be integrated into an ultrarapid sequencing device.
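As a rough check on what a 15-minute genome implies, the arithmetic can be sketched as follows. The abstract does not state a genome size or a coverage depth, so the values used here (about 3 × 10^9 bases, an array of one million pores, and 1x or 30x coverage) are assumptions for illustration only.

```python
def per_pore_rate(genome_bases: float, n_pores: float, seconds: float,
                  coverage: float = 1.0) -> float:
    """Bases per second each pore must read to finish within the time budget."""
    return genome_bases * coverage / (n_pores * seconds)


HUMAN_GENOME_BASES = 3.0e9   # assumption: approximate haploid human genome size
N_PORES = 1.0e6              # assumption: one million pores read in parallel
RUN_SECONDS = 15 * 60        # the 15-minute target from the abstract

if __name__ == "__main__":
    for coverage in (1.0, 30.0):  # coverage depths are illustrative assumptions
        rate = per_pore_rate(HUMAN_GENOME_BASES, N_PORES, RUN_SECONDS, coverage)
        print(f"{coverage:>4.0f}x coverage: {rate:.1f} bases/s per pore")
```

Under these assumptions each pore would need to read only a few bases per second at 1x coverage, which is why slowing and controlling strand translocation, rather than speeding it up, is the focus of aims 2 and 3.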

Ordered Arrays for Advanced Sequencing Systems
Gordon, Steven J.
Intelligent Bio-Systems, Inc., Waltham, Mass.
R44 HG004101
$2,646,000 (2 years)

The advent of next-generation sequencing technologies is allowing researchers to perform studies and make discoveries which previously were not economically or technically feasible. Thus far, however, higher-throughput next-generation sequencing systems are relatively expensive, have relatively long run times and produce relatively short reads thereby limiting their use for diagnostic applications. In this Phase II application, we propose to combine the novel chip fabrication techniques developed in Phase I, the innovative sequencing by synthesis chemistry exclusively licensed from Columbia University, and an automated prototype sequencing instrument to produce an advanced sequencing by synthesis system. This system will be higher throughput and significantly more cost effective than other competing next-generation technologies. During the project, high density chips will be fabricated, the sequencing instrument and chemistry will be optimized and an E. coli genome will be re-sequenced. This system will be capable of producing large amounts of quality sequence data faster and at a lower cost than any other near-term next generation sequencing system. This will make next-generation DNA sequencing technology more accessible to the broad research community.

Direct Real-time Single Molecule DNA Sequencing
Huang, Xiaohua
University of California, San Diego
R01 HG005096
$803,000 (2 years)

We propose to develop a method for direct real-time sequencing of single DNA molecules from genomic DNA at the speed and accuracy of the natural DNA polymerases using native nucleotides. We will harness the power of the true nanomachines used in DNA replication, the natural DNA polymerases. Unlike the difficult-to-engineer man-made nanostructures used in nanopore sequencing to distinguish the four base types in close proximity and constant fluctuation, DNA polymerases have precise atomic-resolution 3D structures and can synthesize very long DNA molecules with high fidelity and velocity. The error rate of a DNA polymerase with proofreading function could be as low as one in a million bases, and a processive polymerase such as phi-29 DNA polymerase can synthesize up to 100,000 bases in a stretch. From the wealth of structural and kinetic studies, it is well known that the fidelity of DNA synthesis is predicated on the exquisite structural complementarity and the numerous specific interactions between the active site of the polymerase protein and the primer/template/nucleotide complex. The dynamic chemo-mechanical or conformational changes accompanying the specific interactions, induced fit, bond cleavage/formation, and template translocation ensure highly accurate and orderly base pairing and incorporation. Our strategy is to engineer sensors onto the surface of the polymerase by protein engineering to monitor the subtle yet distinct conformational changes accompanying the incorporation of each base type. A small distance change (one to tens of angstroms) can be measured precisely with the Förster resonance energy transfer (FRET) technique. Multiple FRET pairs or networks placed at strategic residues on the polymerase will be used to monitor the conformational changes in real time (10 times faster than the rate of DNA synthesis).
The sensors will provide multi-parametric information on the dynamic structures of the polymerases, which very likely will provide a unique signature for each base type incorporated. Chemical modifications such as methylation on the template DNA could also potentially be detected. With such a method, very long DNA molecules could be sequenced with high fidelity in minutes and a human genome or even epigenome could be sequenced in less than one hour. In this proposal, we aim to investigate whether there is a distinguishable FRET signal associated with the incorporation of each of the four different nucleotides by a DNA polymerase.
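The sensitivity of FRET to angstrom-scale distance changes follows from the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the dye pair. The sketch below is not part of the proposal; the R0 value of 50 Å is an assumed, typical order of magnitude, and the distances are illustrative.

```python
def fret_efficiency(r_angstrom: float, r0_angstrom: float) -> float:
    """Förster transfer efficiency for donor-acceptor distance r and Förster radius R0."""
    if r_angstrom < 0 or r0_angstrom <= 0:
        raise ValueError("distance must be non-negative and R0 positive")
    return 1.0 / (1.0 + (r_angstrom / r0_angstrom) ** 6)


R0 = 50.0  # assumption: Förster radius in angstroms for a hypothetical dye pair

if __name__ == "__main__":
    # Efficiency changes steeply near r = R0, so small conformational
    # shifts of the polymerase translate into large signal changes.
    for r in (40.0, 45.0, 50.0, 55.0, 60.0):
        print(f"r = {r:4.0f} A  ->  E = {fret_efficiency(r, R0):.3f}")
```

The sixth-power dependence is what makes a change of a few angstroms near R0 measurable, which is the basis for placing several FRET pairs at different sites on the polymerase.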

Tunnel Junction for Reading All Four Bases with High Discrimination
Lindsay, Stuart
Arizona State University, Tempe
R21 HG005851
$868,000 (3 years)

We have discovered that distinct tunneling signals can be generated for all four nucleosides (and 5-methyldeoxycytidine) using one pair of tunneling electrodes functionalized with a simple reagent containing a hydrogen-bond donor and a hydrogen-bond acceptor. The goals of this proposal are to extend the measurements to nucleotides in aqueous electrolyte, and then to small oligomers. We will quantify the fraction of single-molecule reads and determine the factors that control this fraction, with the goal of eliminating signals that come from more than one nucleotide in the gap at a time. We will explore the factors that control the width of the distribution of current signals for all four bases (and 5-methyl C), with the goal of improving the discrimination of a single read. We will measure the fraction of successful reads and characterize the time required for the complex (that gives rise to the signal) to form in the tunnel gap. From these measurements, we will identify improvements needed to increase the readout efficiency and also develop criteria for the design of a nanopore sequencing system equipped with tunneling electrodes. The reagents developed during the course of this research will be made available to other research groups developing nanopore sequencers that use electron tunneling as the readout.

Single Molecule Sequencing by Nanopore-induced Photon Emission (SM-SNIPE)
Meller, Amit (Contact); Klapperich, Catherine M.; Weng, Zhiping
Boston University
R01 HG005871
$4,160,000

Other performance site: University of Massachusetts Medical School

Our group has laid the groundwork in developing a unique, nanopore-based method for DNA sequencing by nanopore-induced photon emission (SNIPE), which utilizes optical detection rather than the more ubiquitous electrical detection. Our approach is superior to other nanopore approaches as the readout does not involve enzymes, parallelization is straightforward, and the readout is non-destructive. In this grant we propose three distinct aims (developed in parallel), which, when brought together, will enable DNA sequencing at an unprecedented scale in terms of speed (>2 × 10^6 bases/s) and extremely low cost. Our first aim is to dramatically increase the throughput, speed and accuracy of SNIPE. In order to achieve this, we will concentrate our efforts on parallelization of the system through arrays of nanopores (up to 100x100), transformation of the readout from 2 to 4 colors, and increasing the signal-to-background ratio (S/B) of the readout. Our second aim is to develop and optimize our proprietary DNA conversion approach, Circular DNA conversion (CDC). We plan on achieving this first through automation and optimization of CDC using a commercially available benchtop system. After CDC optimization, we plan on developing a microfluidic device capable of converting an entire human genome. Our third aim is the development of data analysis algorithms needed for base calling, consensus building, sequence assembly, and error proofing. In completing these three aims we will have succeeded in developing a radically new, cost-effective DNA sequencing platform, capable of long read lengths, high speed, and high accuracy. This is expected to have a wide-ranging impact on both basic and applied biomedical research and personalized healthcare.

Modeling Macromolecular Transport for Sequencing Technologies
Muthukumar, Murugappan
University of Massachusetts Amherst
R01 HG002776
$804,000 (3 years)

The urgent need to develop revolutionary technologies for sequencing large DNA molecules quickly and economically has led to many experimental strategies. Chief among these are the nanopore-based electrophoretic experiments. In these experiments, translocation of single molecules of DNA is monitored as they pass through protein channels and solid-state nanopores under an external electric field. While the results from such experiments are extremely promising toward reaching the $1000 genome target, there are many puzzles, and the physics of these nanoscopic systems needs to be understood from a fundamental scientific point of view. The proposed research deals with a fundamental understanding of the behavior of DNA in nanopore environments under the influence of electrical and hydrodynamic forces. We will investigate the challenges underlying several key system components in the goal of reducing the cost of sequencing mammalian-sized genomes to $1000. The major challenges deal with the predictability of capture of the target molecule at the nanopore, efficient threading into the pore, and slowing down the translocating molecule through the pore. We will use a combination of statistical mechanics theory, computer simulations, and numerical computation of coupled nonlinear equations to address polymer statistics and dynamics, electrostatics, and hydrodynamics in the phenomena of DNA translocation. The proposed research, while being generally relevant to all nanopore-based experiments, will be hinged specifically on: (a) the role of hybridization in translocation through α-hemolysin, MspA, and solid-state pores, (b) enzyme-modulated DNA translocation through channels, and (c) control of capture rate and successful translocation rate of DNA in protein channels and solid-state nanopores.

Base-selective Heavy Atom Labels for Electron Microscopy-based DNA Sequencing
Toste, Dean
University of California, Berkeley
R21 HG005915
$436,000 (2 years)

The development of inexpensive and rapid DNA sequencing technology remains a major challenge of broad scientific interest. Preliminary work at Halcyon Molecular has shown that transmission electron microscopy (TEM) can be used to obtain ultra-fast ultra-low-cost DNA sequences. Since efficient electron scattering to a detector is highly dependent on atomic number (Z), it is possible to label single-stranded DNA (ssDNA) with heavy atoms. To test the limits of this trend, we propose a multipronged approach to selectively prepare metal-DNA base pair complexes. Our effort will be synergistic, taking advantage of the experience of the Toste group in organometallic and heavy atom cluster synthesis, and the capabilities of Halcyon Molecular in manipulating DNA and performing TEM. For this proposal, we are focusing on the selective labeling of DNA bases and the development of an appropriate assay to evaluate our success.

Two general synthetic methods will be investigated in order to develop distinct labeling protocols. First, triosmium (Z_Os = 76), tetrairidium (Z_Ir = 77) and trigold (Z_Au = 79) clusters tethered to a group that selectively reacts with (alkylating reagents) or binds (platinum diamine complexes) purine bases will be explored. Incorporation of gold (Z_Au = 79) and mercury (Z_Hg = 80) atoms through direct metal-metal bonds to the osmium atoms will also be explored. In this case, the labels would appear as intense spots in the TEM spectra. For the complementary pyrimidine label, osmium tetraoxide bipyridine will be the selective binding agent for thymine and cytosine. Using the bipyridine ligand as a scaffold for functionalization, additional osmium, platinum (Z_Pt = 78) or uranium (Z_U = 92) atoms may be incorporated. A linear arrangement of metal atoms would allow a positional vector to be drawn towards the corresponding base.

Proof-of-concept experiments will be performed on individual DNA bases using nuclear magnetic resonance (NMR) spectroscopy. If successful, testing will be performed on single DNA strands, which will then be sequenced using TEM. The success of these methods will enable the base-selective labeling of DNA with metal atoms and help develop ultra-fast ultra-low-cost DNA sequencing technology.

The assembly of a whole human genome with our pilot-scale instrument can demonstrate TEM sequencing's potential for high consensus accuracy, extremely long (>150kb) reads, and lack of sequence specific bias in molecule deposition and readout. The subsequent, commercial availability of whole human genome sequencing using this technology (with expected >99.9999% consensus accuracy and completeness in 10 minutes>

Last updated: October 03, 2011