Advanced Sequencing Technology Awards 2014

The National Human Genome Research Institute (NHGRI) has continued its coordinated effort to support the development of technologies to dramatically reduce the cost of DNA sequencing, a move aimed at broadening the applications of genomic information in medical research and health care. The 2014 awards were announced on August 1, 2014 (See: NIH awards $14.5 million to research groups studying newest DNA sequencing techniques).

Award Abstracts

Optimization of Nanopore Genomic DNA Sequencing
Akeson, Mark A.
University of California Santa Cruz
R01 HG007827
$761,000 FY14, $2,286,000 (3 years)

This proposal concerns optimization of enzymes, pores, and computational methods for single molecule sequencing of genomic DNA fragments. It is based on a proven nanopore device implemented by our group at UCSC. This device comprises a sensor that touches and examines each nucleotide within a captured DNA strand as a processive enzyme motor advances the strand. Although the overall goal of nanopore sequencing is de novo reads on very long strands, here we will also focus on resequencing of DNA from organisms important in basic research (mouse, E. coli & Arabidopsis) and in healthcare (human). We are focusing on both de novo sequencing and resequencing for two reasons: 1) nanopore sequencing of biological DNA has not been documented publicly, so nanopore resequencing of reference standards is required for community acceptance and, importantly, to reveal weaknesses in the technology that impact de novo sequencing accuracy; 2) nanopore resequencing in this application means reading genomic DNA directly and therefore will include epigenetic modifications. This would be an immediate, important contribution to the research community.

Hybrid Nanopore Platform for DNA Sequencing
Boyanov, Boyan
Illumina, Inc., San Diego, California
R21 HG007833
$296,000 FY14, $592,000 (2 years)

The objective of the proposed research is to demonstrate hybrid nanopores combining top-down patterning and nanofabrication with bottom-up biological assembly. Existing nanopore sequencing platforms, especially those using biological nanopores in lipid bilayers, achieve high speed and read length at the cost of decreased parallelization and data throughput. The fragility of the platform also makes it less practical to commercialize. The primary goal of this project is to establish a hybrid biological-solid state structure that can be self-assembled in a robust automated manner with high efficiency. Our preliminary research has identified plausible carriers of biological channels that can be assembled onto ~10 nm solid-state nanopores. The proposed work will focus on demonstration of nanopore protein incorporation, self-limited channel insertion, and demonstration of biological nanopore activity with the hybrid platform. The proposed platform retains the speed and read length characteristics of biological nanopore systems while dramatically increasing the parallelization and system stability by leveraging established semiconductor technologies.

High-bandwidth DNA Sequencing using Graphene Nanoribbon-nanopore Devices
Drndic, Marija
University of Pennsylvania, Philadelphia
R21 HG007856
$440,000 FY14, $880,000 (2 years)

We propose to demonstrate proof-of-principle single DNA base discrimination by harnessing the one-atom thickness and electrical properties of graphene (as thin as the separation between nucleotides). A direct readout of the DNA sequence would be possible by measuring the modulation of the current flowing through a single-layer graphene nanoribbon (GNR), induced by each base in a single-stranded DNA molecule as it passes through a nanopore (NP) in that GNR. This geometry is anticipated to exhibit large electrical current changes for each nucleotide base due to the unique electrostatic potential associated with each nucleotide. These potentials modulate the charge density in the narrow GNR, altering the corresponding GNR current levels. Current levels ~ mA or higher are predicted, potentially enabling high-bandwidth sequencing. This approach addresses the three key obstacles to nanopore-based sequencing: 1) our approach circumvents the need to slow down DNA motion through the pore, 2) the predicted differences in electronic current for each base are large enough that we anticipate a signal-to-noise ratio sufficient for base discrimination, even at this native speed, and 3) the sequence readout method is compatible with multiplexed detection, which will increase throughput and reduce cost. Important feasibility tests have already been realized in our group. We tested 20-200 nm wide single-layer GNRs with NPs at the GNR edges carrying up to 10 mA in 1 mM to 1 M KCl solution at bandwidths as high as 100 MHz. We also developed a way to drill NPs without lowering the GNR conductance and observed correlated GNR and ionic signals during dsDNA translocation. We anticipate that single-base resolution will be achievable at currently reported DNA translocation speeds (10⁶ bases/s). This eliminates the need for custom high-speed ultralow noise electronics, as many off-the-shelf photodiode amplifiers for fiber-optics are designed for these current and bandwidth ranges. We propose two Specific Aims for a two-year R21 effort:

1. Optimize GNR device parameters and measurement conditions to achieve graphene nanoribbon signals of high magnitude (> hundreds of nA) that are correlated with ionic signals during DNA translocation.

2. Perform experiments with single-stranded DNA molecules to measure graphene nanoribbon signals and prove the single-base discrimination capabilities of this approach.

Enzyme Switch: Many Reporter Molecules from a Single-molecule-sequencing Product
Farinas, Javier
Caerus Molecular Diagnostics, Inc.
R43 HG007843
$234,000 FY14, $701,000 (3 years)

Even as the cost and throughput of commercial sequencers have continued to improve over the last 5 years, there is still a need to further reduce sequencing costs, to increase throughput and sequencing accuracy, and to reduce the costs associated with sample preparation. Single molecule methods such as the Pacific Biosciences or nanopore technologies have the potential to reduce sample preparation bottlenecks but suffer from very high raw error rates. We are developing the Activator Sequencing technology for single molecule sequencing with low error rates. The method is applicable to a variety of readouts such as fluorescence, luminescence, pH sensing and electrochemistry, many of which can be used in a disposable CMOS chip platform similar to that of Ion Torrent. If successful, Activator Sequencing would enable low-cost, long read length, high accuracy sequencing on a scalable platform capable of leveraging semiconductor industry know-how and investments to yield continued yearly increases in performance based on Moore's Law-type decreases in feature size.

Activator Sequencing uses a "molecular amplifier" to convert the products of a single-molecule sequencing reaction into many copies of a readily detectable reporter molecule. Specifically, sequencing-by-synthesis is performed using dNTPs labeled at the terminal phosphate with an enzyme activator. Upon incorporation of a dNTP onto a primed template, an activator is released which can turn an engineered enzyme switch from an "off" to an "on" conformation. Each activated enzyme can rapidly generate a multitude of detectable products thereby amplifying the detectable signal from the original dNTP incorporation. For example, while the Ion Torrent system needs many template copies to generate a detectable pH signal, an activator released from a single dNTP molecule can turn on a single enzyme molecule to generate tens of thousands of protons in a few seconds. The generation of multiple copies of a reporter makes it easier to detect nucleotide incorporation thereby allowing single molecule sequencing with low noise. Such single molecule sequencing would simplify sample preparation and enable very long read lengths by eliminating dephasing limitations. If combined with low-cost, highly parallel CMOS sensors, instrumentation costs would be greatly reduced compared to fluorescence instrumentation. Our preliminary results demonstrate that an engineered enzyme switch can function as such a "molecular amplifier." The proposed Phase I SBIR grant will demonstrate the ability of Activator Sequencing to use an engineered enzyme switch to perform single molecule sequencing with high accuracy using fluorescence detection. Future work would focus on transferring the technology to a scalable, integrated CMOS sensor.

Single-Molecule DNA Sequencing with Engineered Nanopores [THIS IS A RENEWAL]
Ghadiri, M. Reza
Scripps Research Institute
R01 HG003709
$1,111,000 FY14, $4,397,000 (4 years)

Collaborator: Hagan Bayley, University of Oxford

Important technical problems in nanopore sequencing have been overcome within the last five years, culminating in a practicable device with significant advantages, including the ability to sequence DNA strands 100,000 bases in length. Nevertheless, an individual nanopore would take more than a year to sequence a human genome. To sequence a genome in minutes, it is essential to sequence many thousands of DNAs in parallel. Our proposal addresses that issue by developing new methods to produce and monitor nanopore sequencing arrays.

We will explore three general means to form arrays. First, we will examine arrays involving droplet interface bilayers (DIB). DIB arrays will be based on aqueous-aqueous, aqueous-hydrogel or hydrogel-hydrogel interfaces, and will be monitored by electrical or optical recording. In the latter case, each bilayer in the array will contain multiple functional nanopores, allowing a substantial increase in the rate of data acquisition. New lipid chemistry designed to stabilize the arrays will be a critical aspect of this approach. Second, bilayer-free systems will be fabricated by depositing protein pores in apertures in thin solid-state films, notably silicon nitride. New chemistry for the derivatization of the surface oxide layer on silicon nitride will be developed to modify the apertures to accommodate the pores and to prevent current leaks between the pores and the aperture walls. Third, DNA nanostructures will be employed to build arrays. Nanopores suitable for sequencing applications will be constructed from DNA for use with either DIB or solid-state arrays. DNA nanopores or protein nanopores will also be attached to DNA tiles or scaffolds designed to maintain a pore-to-pore spacing suitable for optical detection. Finally, we will investigate nucleobase detection techniques compatible with the three classes of arrays. Advances in parallel electrical detection will be exploited. Optical approaches designed to greatly increase the number of pores that can be monitored will also be explored, including means to increase the field of view by using lensless wide-field detection. More speculatively, super-resolution approaches will be investigated to determine whether the spacing between pores can be decreased into the sub-µm range.

Our proposed studies build on strong preliminary data and our expertise in chemistry and chemical biology to develop new approaches to advance massively parallel nanopore sequencing. The sequencing technologies proposed here promise to deliver chips containing 10⁴, and possibly 10⁶ or more, functional pores. These chips will deliver not only a $1,000 genome, but an ultrarapid genome in as little as 10 minutes.
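
The scale of parallelism described above can be checked with a back-of-envelope calculation. The following is a minimal sketch, not part of the award abstract: the genome size (about 3.2 billion bases), the 30x coverage, and the 100 bases/s single-pore rate are illustrative assumptions, and the function names are hypothetical. It estimates the per-pore read rate needed to finish a genome in a given time on a chip with a given number of pores, and how long a single pore would take.

```python
# Back-of-envelope throughput estimate for a nanopore array.
# All constants below are illustrative assumptions, not values from the abstract.

GENOME_BASES = 3.2e9  # assumed approximate size of a haploid human genome


def required_rate_per_pore(n_pores: int, minutes: float,
                           coverage: float = 30.0,
                           genome_bases: float = GENOME_BASES) -> float:
    """Bases per second each pore must read to finish in `minutes`."""
    if n_pores <= 0 or minutes <= 0:
        raise ValueError("n_pores and minutes must be positive")
    total_bases = genome_bases * coverage
    return total_bases / (n_pores * minutes * 60.0)


def single_pore_days(rate_bases_per_s: float,
                     coverage: float = 1.0,
                     genome_bases: float = GENOME_BASES) -> float:
    """Days a single pore needs to read the genome at the given rate."""
    if rate_bases_per_s <= 0:
        raise ValueError("rate must be positive")
    return genome_bases * coverage / rate_bases_per_s / 86400.0


if __name__ == "__main__":
    for pores in (10**4, 10**6):
        rate = required_rate_per_pore(pores, minutes=10)
        print(f"{pores} pores: {rate:.0f} bases/s per pore for a 10-minute genome")
    # Assumed single-pore rate of 100 bases/s, single-pass coverage.
    print(f"single pore: {single_pore_days(100.0):.0f} days")
```

Under these assumptions, a single pore reading 100 bases/s needs more than a year for one pass over the genome, which is consistent with the motivation for arrays given in the abstract; the required per-pore rate falls in direct proportion to the number of functional pores.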

Sequencing by Transcription using Single-molecule Field-effect Transistors
Kotseroglou, Theofilos
Eve Biomedical, Inc., Mountain View, California
R43 HG007871
$250,000 FY14, $500,000 (2 years)

Collaborator: Ken Shepard, Columbia University

The long-term goal of the proposed research is the design and construction of a DNA sequencing system that can sequence the whole human genome for under $100, including sample preparation, with the cost of goods of the system well under $5,000. The system developed under the proposed research improves on the cost target of the solicitation by at least an order of magnitude while maintaining all performance metrics. In that respect it will break the barrier towards genomic medicine.

The overall system is based on sequencing DNA on a carbon nanotube CMOS array via motion of enzymes attached to each nanotube while transcribing or polymerizing the sample DNA. The enzyme motion at each transcription step translates to conductivity changes. When one of the nucleotides is reduced in concentration compared to the other three, a pause will be detected in a series of faster steps. Iterating over all four nucleotides will lead to decoding of all base positions with respect to each other. This assay has successfully sequenced DNA in an alternative optical system, but with read lengths limited to the order of 5 kb. The current system can achieve read lengths >50 kb.
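
The pause-based decoding described above can be illustrated with a toy example. This is a minimal sketch under strong assumptions, not Eve Biomedical's actual assay or analysis: it assumes four runs (one per limiting nucleotide), that each run yields one dwell time per incorporation step, and that the runs are already aligned step for step. The names `simulate_runs` and `decode_from_pauses`, the dwell times, and the threshold are all hypothetical.

```python
from typing import Dict, List

BASES = "ACGT"


def simulate_runs(sequence: str, fast: float = 1.0,
                  slow: float = 10.0) -> Dict[str, List[float]]:
    """Toy data: in the run where base b is limiting, steps adding b are slow."""
    return {b: [slow if base == b else fast for base in sequence] for b in BASES}


def decode_from_pauses(runs: Dict[str, List[float]],
                       pause_threshold: float) -> str:
    """Call each position from the single run that shows a pause there.

    Positions with no pause, or with pauses in more than one run, are
    reported as 'N' (ambiguous).
    """
    lengths = {len(trace) for trace in runs.values()}
    if len(lengths) != 1:
        raise ValueError("all runs must cover the same number of steps")
    n_steps = lengths.pop()
    called = []
    for i in range(n_steps):
        paused = [b for b, trace in runs.items() if trace[i] > pause_threshold]
        called.append(paused[0] if len(paused) == 1 else "N")
    return "".join(called)


if __name__ == "__main__":
    runs = simulate_runs("GATTACA")
    print(decode_from_pauses(runs, pause_threshold=5.0))  # prints GATTACA
```

Real traces would be noisy and would not arrive pre-aligned, so registering the runs against each other is the hard part that this sketch leaves out.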

Preliminary data show that enzymes can be loaded onto a nanotube in single-molecule fashion. Furthermore, other groups have shown that the motion of enzymes can be monitored while on nanotubes. While this is a great start, Phase I work will investigate the optimal attachment of enzymes to nanotubes and demonstrate sequencing using Eve Biomedical's assay, investigating the accuracy, read length and throughput limits of this sequencing approach. If successful, the main goal is to completely sequence a microbial genome (E. coli) by the end of Phase I using a prototype system, and thus fully benchmark the proposed architecture.

Beyond Phase I, a low-cost benchtop system will be constructed to sequence whole human genomes at the target cost defined as the goal of this grant solicitation.

Massively Parallel Contiguity Mapping [THIS IS A RENEWAL]
Shendure, Jay Ashok
University of Washington, Seattle
R01 HG006283
$552,000 FY14, $1,686,000 (3 years)

Collaborators: Evan Eichler, Phil Green, Jens Gundlach & Robert Waterston, U of Washington

Even as new technologies continue to drive down the cost of DNA sequencing, we are in critical need of equivalently powerful methods informing long-range contiguity to support both de novo genome assembly and haplotype-resolved genome resequencing. With funding through this program, we have explored diverse approaches for low-cost, massively parallel capture of contiguity information. Our progress is substantial, and includes the development of a method for in situ library construction and optical sequencing, a method in which we exploit 'contact probability maps' to produce the first chromosome-scale de novo mammalian genome assemblies based exclusively on short reads, and a method that combines contiguity preserving transposition and combinatorial indexing for accurate, megabase-scale haplotype-resolved human genome resequencing. We have also demonstrated the remarkable value of contiguity information through signature projects, including the first accurate, non-invasive prediction of a fetal genome, and the first haplotype-resolved sequencing of a cancer genome and epigenome. In this renewal application, we propose to narrow our focus to the advanced development of our two most promising approaches, namely contact probability mapping (Aim 1) and contiguity preserving transposition (Aim 2). We will then formally evaluate these methods for cost, performance and scalability, while also seeking to integrate them with one another and with emerging sequencing paradigms (Aim 3). Coupled with a modest drop in the per-base cost of short read DNA sequencing, these methods will enable chromosome-scale de novo assembly of large genomes as well as chromosome-scale haplotype-resolved human genome resequencing for about $1,000.

Single-stranded Sequencing using Microfluidic Reactors (SISSOR)
Zhang, Kun; Huang, Xiaohua
University of California, San Diego
R01 HG007836
$919,000 FY14, $3,719,000 (4 years)

Collaborator: Vineet Bafna, U of California, San Diego

Recent advances in acquiring genome information quickly and inexpensively have transformed many aspects of biomedical and environmental research. Clinical sequencing applications have emerged and are beginning to touch our daily lives. While the $1,000 genome, previously thought unreachable, has become a difficult-to-miss target within the next 2-3 years, major challenges remain in terms of both accuracy and long-range continuity, both of which have direct implications for the clinical applicability of genome sequencing as well as for many nonmedical applications.

In this project we will develop SISSOR (SIngle-Stranded Sequencing using micrOfluidic Reactors), with the goal of sequencing a mammalian-size genome at a consensus error rate of 10⁻⁹ or lower, with a haplotype contig N50 of at least 10 Mb, for $1,000. In addition, SISSOR can start with a single cell, providing the capability of dissecting somatic mutations in heterogeneous tissues (such as cancers and brain), and extracting genome information de novo from difficult-to-culture organisms. Instead of proposing a completely new sequencing method, we chose to develop an integrative device focusing on the front-end preparation, which, in combination with the existing sequencing-by-synthesis chemistry, provides a highly realistic path to achieve the above goal within a 4-year development cycle.
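
For readers unfamiliar with the N50 metric used in the goal above, the following minimal sketch shows the standard definition: the contig length L such that contigs of length L or longer contain at least half of the total assembled bases. The function name and the example lengths are illustrative and do not come from the abstract.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig where the cumulative sum reaches half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Total is 54; the three longest contigs (10 + 9 + 8 = 27) reach half of it.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```

A haplotype contig N50 of at least 10 Mb therefore means that at least half of the phased bases lie in haplotype blocks of 10 Mb or longer.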

Last updated: August 01, 2014