The Power of Sequencing Single Cell Genomes
Roseanne F. Zhao, Ph.D.
NIH Medical Scientist Training Program Track 3 Scholar
As the fundamental unit of life, each cell contains a complete copy of an organism's genome, which can undergo dynamic DNA mutations as the cell grows and divides. Studying the genomes of single cells is important for tracking global patterns of change across hundreds or thousands of individual cells, and will help to elucidate changes that occur in DNA over time. Among other things, this will allow scientists to gain insight into the development of mutations and diversity in the genome, as well as to track genetic changes associated with the origin and progression of different diseases.
In this issue of Genome Advance, we focus on a novel technique that allows researchers to accurately sequence a single cell.
Conventional whole genome sequencing technologies use DNA extracted from large numbers of cells to acquire enough starting material for sequencing. A consensus sequence is obtained by aligning, or piecing together, many short bits (known as reads) of DNA sequence. The identity at each position of the genome is determined by frequency, so that only the most frequently found base pair, or letter, at each position appears in the final sequence. However, because differences between cells are averaged out, this approach misses much of the variation between cells in a population.
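To make the averaging effect concrete, here is a minimal Python sketch of a majority-vote consensus. It is an illustration only, not the pipeline used by any real sequencing software: the function names and the five toy reads are hypothetical, and the reads are assumed to be already aligned and of equal length.

```python
from collections import Counter


def consensus(aligned_reads):
    """Return the most frequent base at each position of equal-length aligned reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


def minority_positions(aligned_reads):
    """Return the positions where at least one read disagrees with the consensus."""
    reference = consensus(aligned_reads)
    return [
        position
        for position, column in enumerate(zip(*aligned_reads))
        if any(base != reference[position] for base in column)
    ]


# Hypothetical pooled sample: one read per cell, and one cell carries a G
# instead of an A at position 2 (counting from 0).
toy_reads = ["ACATG", "ACATG", "ACATG", "ACATG", "ACGTG"]

print(consensus(toy_reads))           # ACATG
print(minority_positions(toy_reads))  # [2]
```

The consensus reports only the A at position 2, so the single cell carrying a G is invisible in the final sequence unless the cells are examined individually.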
To address this problem, researchers have been working to optimize technologies that will allow them to sequence single cells. A single cell contains only a tiny amount of DNA, as little as a few picograms (a picogram is equal to one-trillionth of a gram). A considerable technical challenge has therefore been determining the best method for DNA amplification, that is, making multiple copies of the DNA, to obtain a quantity sufficient for sequencing.
Existing methods commonly used to amplify DNA from a single cell for sequencing, including multiple displacement amplification and polymerase chain reaction, rely on non-linear or exponential amplification. In other words, in every cycle of the replication process, DNA copied from the original can serve as a template for making more copies. However, if a copying error is made during an early replication cycle, or if some parts of the genome are amplified more efficiently than others, then those copies will be over-represented in the final sample.
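The effect of an early copying error can be sketched with some simple arithmetic. The Python example below assumes a deliberately idealized model that is not taken from the paper: in the exponential scheme every molecule is copied perfectly once per cycle, and in the linear scheme only the original template is copied, once per cycle. The function names are hypothetical.

```python
from fractions import Fraction


def exponential_error_fraction(error_cycle, total_cycles):
    """Fraction of final molecules carrying an error made in one copy at error_cycle.

    Idealized model: the pool doubles every cycle, so the one erroneous copy
    doubles in each remaining cycle along with everything else.
    """
    if not 1 <= error_cycle <= total_cycles:
        raise ValueError("error_cycle must lie between 1 and total_cycles")
    erroneous = 2 ** (total_cycles - error_cycle)
    total = 2 ** total_cycles
    return Fraction(erroneous, total)


def linear_error_fraction(total_cycles):
    """Fraction of final molecules carrying an error made in one copy.

    Idealized model: only the original template is copied, one new copy per
    cycle, so an erroneous copy is never copied again.
    """
    if total_cycles < 1:
        raise ValueError("total_cycles must be at least 1")
    return Fraction(1, total_cycles + 1)


# An error made in the very first of 10 cycles:
print(exponential_error_fraction(1, 10))  # 1/2
print(linear_error_fraction(10))          # 1/11
```

Under these assumptions, a first-cycle error ends up in half of the exponentially amplified molecules but in only one of eleven linearly amplified molecules, which is why copying only the original template limits how far an error can spread.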
The biases inherent in these techniques can lead to uneven distribution of sequencing reads across the genome and a lower percentage of the genome covered by the final sequence. As a result, these methods are often not sensitive enough to detect changes such as single nucleotide polymorphisms (SNPs) — variations in a single letter of the DNA sequence — and copy number variants (CNVs) — differences in the number of larger segments of DNA that are commonly deleted or duplicated.
A new amplification method, developed by Professor Xiaoliang Sunney Xie's group at Harvard University and reported in the December 21, 2012, issue of Science, reduces this bias. Known as multiple annealing and looping-based amplification cycles (MALBAC), this technique employs special primers to initiate the amplification process, allowing the DNA to be copied in a non-exponential way. (Primers are short strands of complementary DNA molecules that bind to the DNA template and act as a starting point for the synthesis of new DNA.)
The improved accuracy and quality of single cell sequencing with MALBAC represent a significant advance in both basic science and individualized medicine. This technology can aid scientists in understanding how specialized and genetically different cells function together to form complex systems like brain networks. It may also be valuable for tracking cancers, screening prenatal samples or testing forensic specimens when just one cell is available for analysis.
Through the use of special primers in MALBAC, the genomic DNA is amplified to form loops that can't act as templates for further replication. Due to the formation of these amplified loops, only the initial genomic DNA can be copied in each cycle, resulting in a near linear amplification of the original DNA. This pre-amplification process reduces much of the sequencing bias present in existing methods. It also promotes more uniform copying of the whole genome, giving sequences in which up to 93 percent of the genome is covered by at least one sequence read and by 25 reads on average. In comparison, the authors found that sequencing with multiple displacement amplification, an existing method commonly used for amplifying DNA from single cells, resulted in only 72 percent genome coverage under the same conditions.
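The two figures quoted above, the percentage of the genome covered by at least one read and the average number of reads per position, can be computed from read alignments as in the following Python sketch. The function name, the 10-base toy genome and the three read intervals are hypothetical, and reads are assumed to be given as half-open (start, end) positions.

```python
def coverage_stats(genome_length, read_intervals):
    """Return (breadth, mean_depth) for reads given as half-open (start, end) intervals.

    breadth    : fraction of positions covered by at least one read
    mean_depth : average number of reads covering each position
    """
    depth = [0] * genome_length
    for start, end in read_intervals:
        for position in range(start, end):
            depth[position] += 1
    breadth = sum(1 for d in depth if d > 0) / genome_length
    mean_depth = sum(depth) / genome_length
    return breadth, mean_depth


# Hypothetical 10-base genome with three reads; positions 8 and 9 are never covered.
toy_intervals = [(0, 4), (2, 6), (5, 8)]
breadth, mean_depth = coverage_stats(10, toy_intervals)

print(breadth)     # 0.8
print(mean_depth)  # 1.1
```

Uneven amplification lowers the first number without necessarily lowering the second: reads pile up on over-amplified regions while other regions receive none.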
The uniform genomic amplification using MALBAC allowed the authors to accurately detect the generation and accumulation of SNPs and CNVs in a colon cancer cell line, something that existing methods could not reliably do. Such data may have implications for understanding tumor evolution, tracking circulating tumor cells and determining responses to drug treatments for individual patients in a clinical setting.
This technique can also be used to sequence other types of cells. In a companion paper published in collaboration with Professor Ruiqiang Li and colleagues at Peking University in Beijing, China, the authors applied MALBAC to sequence the genomes of 99 sperm cells from a healthy donor.
Unlike other cells in the human body, egg and sperm cells contain only a single set of chromosomes; the joining of a sperm and egg results in a fertilized cell with the normal 23 pairs of chromosomes that comprise the human genome. Genetic diversity is introduced to egg and sperm cells through crossover events that mix the DNA between each pair of chromosomes, before the chromosomes are separated (segregated) and distributed to different cells. Abnormalities generated during these crossover events or during chromosome segregation are the most common source of birth defects and miscarriage, and can also be associated with male infertility.
The authors found that DNA abnormalities such as SNPs and CNVs can be found even in the sperm of a healthy donor. By studying SNPs in single sperm cells, the authors were able to map the positions of crossover events or chromosome segregation errors at very high resolution. This study, and others like it, has implications for understanding genome instability and fertility.
Other exciting uses of the new technology include studying the dynamic nature of microbial populations such as the human microbiota, a population of bacteria, fungi, protozoa, and viruses that live on and inside the human body and contribute to maintaining human health. (For more information, see: The Human Microbiome Project.) The study of bacteria that thrive in other environments will aid our exploration of the genomic variation in different microbes, and may help in the development of renewable biofuels and other environmentally sustainable technologies.
This is just the start. As techniques continue to be optimized, scientists will be able to apply single cell sequencing to a variety of problems, including dynamic genomic variation in various species and questions concerning the environment and human health.
Read the articles:
Zong C, Lu S, Chapman AR, Xie XS. Genome-Wide Detection of Single-Nucleotide and Copy-Number Variations of a Single Human Cell. Science, 338(6114):1622-6. 2012. [PubMed]
Lu S, Zong C, Fan W, Yang M, Li J, Chapman AR, Zhu P, Hu X, Xu L, Yan L, Bai F, Qiao J, Tang F, Li R, Xie XS. Probing Meiotic Recombination and Aneuploidy of Single Sperm Cells by Whole-Genome Sequencing. Science, 338(6114):1627-30. 2012. [PubMed]
Posted: January 22, 2013