Analyzing Expression Levels
As an example of the quantitative ability of the assay, we have used mRNA-Seq to analyze expression levels of all genes in two heavily studied samples that were part of the Microarray Quality Control (MAQC) project. These two samples, the universal human reference RNA and a mixed whole human brain sample, have been intensely studied using all of the major microarray platforms as well as quantitative PCR.
Figure 1 compares fold changes calculated from the digital counts of the mRNA-Seq assay with fold changes measured by quantitative PCR for about 750 genes. The agreement between these two independent assays supports the accuracy and range of the mRNA-Seq assay.
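The kind of comparison summarized in Figure 1 can be sketched in a few lines. This is a minimal illustration, not the analysis used to produce the figure: the function names, the pseudocount, and the normalization by total read count are all assumptions made for the example.

```python
import math


def log2_fold_change(count_a, count_b, total_a, total_b, pseudocount=1.0):
    """Log2 ratio of depth-normalized read counts for one gene in two samples.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no reads in one of the samples.
    """
    rate_a = (count_a + pseudocount) / total_a
    rate_b = (count_b + pseudocount) / total_b
    return math.log2(rate_a / rate_b)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Given one list of sequencing-derived log2 fold changes and one list of qPCR-derived log2 fold changes for the same genes, `pearson` returns a single number summarizing how well the two assays agree.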
In addition to measuring digital gene-expression levels for all transcripts, the mRNA-Seq assay can be used to study transcript structure in genes with many reads spread out along the length of the original mRNA molecule. These reads provide information about alternative splicing, since many of the reads span exon-exon junctions formed during normal mRNA processing. Because they occur only as a result of mRNA splicing, the sequences created at splice junctions usually do not fully align back to the genome. Instead, these reads provide specific evidence of the order of exons within a given transcript.
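The idea that a junction read matches the spliced transcript but not the contiguous genome can be shown with a toy example. The sequences, the function name `spans_junction`, and the `min_overhang` parameter below are hypothetical and chosen only for illustration; real aligners use far more elaborate methods.

```python
def spans_junction(read, upstream_exon, downstream_exon, min_overhang=4):
    """Return True if the read matches the spliced exon pair and covers the
    exon-exon boundary with at least min_overhang bases on each side."""
    spliced = upstream_exon + downstream_exon
    boundary = len(upstream_exon)
    start = spliced.find(read)
    while start != -1:
        end = start + len(read)
        if start <= boundary - min_overhang and end >= boundary + min_overhang:
            return True
        start = spliced.find(read, start + 1)
    return False


# Toy gene: two exons separated by an intron (made-up sequences).
exon1 = "ATGGCCAAGT"
intron = "GTAAGTTTTTCAG"
exon2 = "CTGGAGTTCA"
genome = exon1 + intron + exon2

# A read made of the last 5 bases of exon1 and the first 5 bases of exon2.
junction_read = exon1[-5:] + exon2[:5]

aligns_to_genome = junction_read in genome
is_junction_read = spans_junction(junction_read, exon1, exon2)
```

Here the read is absent from the contiguous genomic sequence because the intron interrupts it, yet it matches the spliced sequence across the boundary, which is the evidence that exon1 is joined directly to exon2 in the transcript.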
Figure 2 is a visualization of junction reads that shows how they can be used to study alternative splicing. This figure is a screenshot taken from the GenomeStudio™ Software suite developed by Illumina to help users analyze and interpret the data from the mRNA-Seq Assay. The software displays tables of quantitative data in the form of SNPs and digital counts associated with known genes, exons, and splice junctions. In addition, the visualization tools can be used to help annotate new transcripts and understand the complexities of alternative splicing.
The mRNA-Seq Assay can be used to study polymorphisms in the transcriptome, and as a tool to gain insight into the genetics of transcription. The GenomeStudio software reports every position in the sample where the called consensus base differs from the human reference genome.
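A report of this kind can be approximated with a simple consensus comparison. The sketch below is not the GenomeStudio algorithm, which is not described here; the pileup representation, the function name, and the `min_depth` and `min_fraction` thresholds are assumptions made for illustration.

```python
from collections import Counter


def consensus_differences(reference, pileup, min_depth=4, min_fraction=0.8):
    """List positions where the consensus base differs from the reference.

    reference: reference sequence as a string.
    pileup: dict mapping a 0-based position to the list of bases that
            aligned reads show at that position.
    Positions with fewer than min_depth reads, or with no base reaching
    min_fraction of the reads, are skipped as having no confident consensus.
    """
    differences = []
    for pos in sorted(pileup):
        bases = pileup[pos]
        if len(bases) < min_depth:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_fraction and base != reference[pos]:
            differences.append((pos, reference[pos], base))
    return differences
```

A strict majority threshold like this one favors positions where nearly all reads agree; detecting variants present in only a fraction of the reads, as would be expected in a pooled sample, would need a lower threshold or a different model.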
The software automatically creates an Allele Table of putative coding SNPs from every mRNA-Seq Assay. Figure 3 shows the results of analyzing SNPs in the MAQC human brain sample, a mixture of whole-brain RNA from 23 individuals. The figure shows a clear example of a novel SNP that occurs in the coding region of a known gene.
The software color-codes sequence differences and shows newly discovered SNPs in red characters. Once the SNPs in each transcript have been characterized, they can be used to study allele-specific expression patterns for each gene across individual samples. This approach can help improve our understanding of the relationship between genetics and gene expression.
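One simple way to look for allele-specific expression at a coding SNP is to ask whether reads carrying the two alleles depart from an even split. The sketch below uses an exact two-sided binomial test against a 50:50 expectation; the function name and the choice of test are assumptions for illustration, not a description of the software.

```python
from math import comb


def allelic_imbalance(ref_reads, alt_reads):
    """Return (reference allele fraction, two-sided exact binomial p-value)
    for the null hypothesis that both alleles are expressed equally."""
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this SNP")
    observed = comb(n, ref_reads)
    # Under p = 0.5 every outcome k has probability comb(n, k) / 2**n, so
    # the p-value sums all outcomes no more likely than the observed one.
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return ref_reads / n, tail / 2 ** n
```

A small p-value at a heterozygous SNP suggests that one allele contributes more transcripts than the other. This only makes sense for a sample from a single individual; in a pooled sample the allele fraction also reflects allele frequency among the donors.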
mRNA-Seq combines the benefits of microarrays, quantitative PCR, and EST sequencing in one assay. It can be used to study the transcriptome of any organism and is not biased by what is or is not known about any genome. The assay couples accurate and precise quantification with hypothesis-free, open-ended discovery, all in one experiment.