Transcriptome Fact Sheet

National Human Genome Research Institute

National Institutes of Health
U.S. Department of Health and Human Services


What is a transcriptome?

The genome is made up of deoxyribonucleic acid (DNA), a long, winding molecule that contains the instructions needed to build and maintain cells. For these instructions to be carried out, DNA must be transcribed into corresponding molecules of ribonucleic acid (RNA), referred to as transcripts.

A transcriptome is a collection of all the transcripts present in a given cell.

There are various kinds of RNA. The major type, called messenger RNA (mRNA), plays a vital role in making proteins. In this process, mRNA transcribed from genes, which include the protein-coding parts of the genome, is delivered to ribosomes, which are molecular machines located in the cell's cytoplasm. The ribosomes read, or "translate," the sequence of the chemical letters in mRNA to assemble building blocks called amino acids into proteins. Each mRNA is transcribed from a gene and then translated into a specific protein.
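
The two steps described above, transcription and translation, can be illustrated with a short Python sketch. This is a simplified illustration, not a model of real cell biology: the toy gene sequence is invented, the function names are hypothetical, and the codon table lists only the five codons this example needs (the full genetic code has 64).

```python
# Partial codon table: only the codons used in this toy example.
# The real genetic code has 64 codons.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual start signal
    "GCU": "Ala",   # alanine
    "UGG": "Trp",   # tryptophan
    "AAA": "Lys",   # lysine
    "UAA": "Stop",  # stop signal: translation ends here
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


toy_gene = "ATGGCTTGGAAATAA"  # invented sequence for illustration only
mrna = transcribe(toy_gene)
protein = translate(mrna)
print(mrna)     # AUGGCUUGGAAAUAA
print(protein)  # ['Met', 'Ala', 'Trp', 'Lys']
```

In this sketch the DNA is written as the coding strand, so transcription amounts to swapping T for U; a cell actually builds the mRNA by reading the complementary template strand.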

DNA can also be transcribed into other types of RNA that do not code for proteins. Such transcripts may serve to influence cell structure and to regulate genes.

Is a transcriptome the same as a genome?

No, a transcriptome is different from a genome, which is the entire DNA sequence of an organism. A transcriptome represents the very small percentage of the genome - less than 5 percent in humans - that is transcribed into RNA molecules. A gene may produce many different types of mRNA molecules, so a transcriptome is much more complex than the genome that encodes it.

What can a transcriptome tell us?

The sequence of an RNA mirrors the sequence of the DNA from which it was transcribed. Consequently, by analyzing the entire collection of RNAs (transcriptome) in a cell, researchers can determine when and where each gene is turned on or off in the cells and tissues of an organism.

Depending on the technique used, it is often possible to count the number of transcripts to determine the amount of gene activity, also called gene expression, in a certain cell or tissue type.
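
As a rough sketch of what counting transcripts means in practice, the Python example below tallies how often each gene appears in a list of observed transcripts. The gene names and the list itself are hypothetical, and real techniques involve many additional steps (such as matching each sequence to its gene) that are omitted here.

```python
from collections import Counter

# Hypothetical data: each entry names the gene that one observed
# transcript came from.
observed_transcripts = [
    "GENE_A", "GENE_B", "GENE_A", "GENE_C",
    "GENE_A", "GENE_B", "GENE_A", "GENE_A",
]

# Count the transcripts seen for each gene.
counts = Counter(observed_transcripts)

# Express each count as a fraction of all transcripts, so that samples
# containing different total numbers of transcripts can be compared.
total = sum(counts.values())
fractions = {gene: n / total for gene, n in counts.items()}

for gene, n in counts.most_common():
    print(gene, n, fractions[gene])
```

Here GENE_A accounts for 5 of the 8 transcripts, suggesting it is the most active of the three genes in this imaginary sample.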

In humans and other multicellular organisms, nearly every cell contains the same genes, but different cells show different patterns of gene expression. These differences are responsible for the many different properties and behaviors seen among various cells and tissues, both in health and disease.

By collecting and comparing transcriptomes of different types of cells, researchers can gain a deeper understanding of what constitutes a specific cell type, how that type of cell normally functions, and how changes in the normal level of gene activity may reflect or contribute to disease. Furthermore, by aligning the transcriptome of each cell type to the genome, it is possible to generate a comprehensive, genome-wide picture of which genes are active in which cells.

How can transcriptome data be used to explore gene function?

The function of most genes is not yet known. A search of a transcriptome database can give researchers a list of all the tissues in which a gene is expressed, providing clues to its possible function.

For example, if the transcriptome database shows that a gene's expression levels are dramatically higher in cancer cells than in healthy cells, the gene may play a role in cell growth. Or if a gene is expressed in fat tissue, but not in bone or muscle tissue, it may be involved in fat storage or metabolism. In both instances, the transcriptome data give researchers a good place to start in the search for a new gene's function.
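
The cancer-versus-healthy comparison described above can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the gene names, the expression values, the pseudocount, and the cutoff are invented and do not come from any real database.

```python
import math

# Hypothetical expression levels (arbitrary units) for three genes.
healthy = {"GENE_A": 10.0, "GENE_B": 50.0, "GENE_C": 8.0}
cancer = {"GENE_A": 160.0, "GENE_B": 55.0, "GENE_C": 2.0}

PSEUDOCOUNT = 1.0  # assumed small constant that avoids division by zero
THRESHOLD = 2.0    # assumed cutoff: log2 change of 2, roughly 4 times higher


def log2_fold_change(gene):
    """Compare a gene's level in cancer cells with its level in healthy cells."""
    ratio = (cancer[gene] + PSEUDOCOUNT) / (healthy[gene] + PSEUDOCOUNT)
    return math.log2(ratio)


# Genes whose expression is dramatically higher in the cancer sample.
elevated = [gene for gene in healthy if log2_fold_change(gene) >= THRESHOLD]
print(elevated)  # ['GENE_A']
```

A gene flagged this way is only a candidate for further study; a difference in expression alone does not establish what the gene does.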

The National Human Genome Research Institute (NHGRI), which is part of the National Institutes of Health (NIH), has participated in two projects that created transcriptome resources for use by researchers around the world. Those projects were the Mammalian Gene Collection initiative and the Mouse Transcriptome Project.

What was the Mammalian Gene Collection?

The Mammalian Gene Collection initiative built a library of human, mouse and rat mRNA sequences in a form called complementary DNA (cDNA) clones. The project, which was co-led by NHGRI and the National Cancer Institute, also part of NIH, has provided at least one cDNA clone for most known human and mouse genes.

Researchers can view the cDNA sequence data in a free, public database located at the Mammalian Gene Collection. They can also order copies of these cDNA clones and then insert them into bacterial or mammalian cells, causing the new host to synthesize the proteins encoded by that particular gene transcript. This enables researchers to study the protein's properties in greater detail, and to examine the effects that the protein and mutant versions of the protein may have on various cell types. It also makes it possible to obtain large amounts of the gene, the transcript or the protein to use in biochemical studies.

What was the Mouse Transcriptome Project?

The Mouse Transcriptome Project was an NIH-supported initiative that generated a free, public database of gene transcripts for many mouse tissues. These tissue-specific expression data, which are mapped to the mouse genome, are available in a searchable format in the Mouse Reference Transcriptome Database.

The mouse was chosen for this effort because its genome has been sequenced, because its tissues could be obtained under rigorous quality control conditions, and because of its importance as a model for the study of human biology.

Last Reviewed: June 12, 2012