Published on October 31, 2013
High-Resolution Transcriptome Analysis: One Cell at a Time
AMATA 2013, Queensland, Australia, October 16, 2013
Jian-Bing Fan, Senior Director, Scientific Research
© 2013 Illumina, Inc. All rights reserved. Illumina, IlluminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, Infinium, iSelect, MiSeq, Nextera, NuPCR, SeqMonitor, Solexa, TruSeq, TruSight, VeraCode, the pumpkin orange color, and the Genetic Energy streaming bases design are trademarks or registered trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.
The Intuitive Beauty of RNA-Seq Data
• All junctions are covered uniformly in RNA-Seq
RNA-Seq has evolved in 5 years
• New methods: stranded vs. non-stranded
  – New stranded RNA prep kits
• New methods: poly-A vs. total RNA
  – RiboZero kits are the method of choice for rRNA reduction
  – Total RNA methods reveal ncRNAs and allow “RIN independent” preps
• Lower input levels
  – Standard input into all TruSeq RNA kits today is only 100 ng total RNA
• Methods for studying highly degraded RNA
  – Can sequence RNA from FFPE samples
• Single-cell RNA sequencing methods
Why single cells
• Cellular heterogeneity
  – What is a cell type?
  – How many cell types are there?
• Non-symptomatic somatic mutations
  – Cells at terminal differentiation contain “substantial” variations
• Development and cellular differentiation
  – Cell lineage
  – Reprogramming
• Metagenomes
• Circulating cells (liquid biopsy)
  – CTC
  – Stem cells
  – Fetal cells
Single cell transcriptional landscapes
Unbiased cell-type discovery
(Sten Linnarsson, MBB, Mol Neuro)
STRT (single-cell tagged reverse transcription)
• Based on template switching at the 5’ end of mRNA
• Barcoding already at the RT step, pooling before amplification
• Sequence ~50 bp from the 5’ end of mRNA (= TSS)
• Highly multiplexed: 96 cells at a time
(Sten Linnarsson, MBB, Mol Neuro)
STRT (single-cell tagged reverse transcription)
1. Reverse transcription, with TdT activity adding Cs
2. Template switching, PCR
3. Fragmentation, retaining 5’ end
4. P2 adapter
5. P1 adapter (library PCR)
6. Finished library
(Sten Linnarsson, MBB, Mol Neuro)
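Because STRT barcodes each cell at the RT step and pools before amplification, reads must be assigned back to cells after sequencing. The following is a minimal sketch of that step, not the published pipeline. It assumes the cell barcode sits in the first bases of each read; the barcode length, barcode sequences, cell names, and reads are all hypothetical.

```python
from collections import defaultdict

BARCODE_LEN = 5  # assumption: the barcode occupies the first 5 bases of a read


def demultiplex(reads, barcode_to_cell, barcode_len=BARCODE_LEN):
    """Group pooled reads by cell using the barcode at the start of each read.

    Returns (per_cell, unassigned). per_cell maps a cell name to its reads
    with the barcode trimmed off; unassigned keeps reads whose barcode is
    not in the lookup table.
    """
    per_cell = defaultdict(list)
    unassigned = []
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        cell = barcode_to_cell.get(barcode)
        if cell is None:
            unassigned.append(read)
        else:
            per_cell[cell].append(insert)
    return dict(per_cell), unassigned


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGTA": "cell_01", "TTGCA": "cell_02"}
reads = ["ACGTAGGGATCCA", "TTGCAGGGTTTAC", "ACGTAGGGCCCAT", "NNNNNGGGAAAAA"]
per_cell, unassigned = demultiplex(reads, barcodes)
```

Exact matching is the simplest choice; a real workflow would likely also tolerate sequencing errors in the barcode.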
Reproducibility
[Figure: two log-scale scatter plots (axes from 1 to 10,000) with panel titles “ES cells” and “Synthetic mRNA”; reported R2 = 0.97 and R2 = 0.98. Axes: mRNA molecules (ES cell #1) vs. mRNA molecules (ES cell #2); number of molecules (single well) vs. number of molecules (single well).]
(Sten Linnarsson, MBB, Mol Neuro)
Distinguish cell types by clustering
1. 96 individual cells, representing 3 different cell types, were profiled.
2. Transcripts from each cell were tagged with a short 5-base code (during RT), pooled across the 96 cells for amplification, and made into a sequencing library for mRNA-Seq.
3. Cell neighborhood was calculated based on individual cell expression profiles.
4. The result is a set of clusters of mutually similar cells, which reflected the true identity of the cells.
Cell types: embryonic stem cells, neuroblastoma (Neuro2A), embryonic fibroblasts (MEF)
(Sten Linnarsson, Karolinska Inst)
Reference: Islam S, Kjällquist U, Moliner A, Zajac P, Fan JB, Lönnerberg P, Linnarsson S. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Research. 2011.
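The idea of grouping cells by the similarity of their expression profiles can be sketched as follows. This is an illustrative stand-in, not the method used in the cited study: it assumes Pearson correlation as the similarity measure, links cells above a hypothetical threshold, and reports connected components as clusters. Cell names and values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_cells(profiles, min_corr=0.8):
    """Link cells whose profiles correlate at or above min_corr (an assumed
    threshold) and return the connected components as sorted clusters."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= min_corr:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical log-scale expression values for four genes in four cells.
profiles = {
    "es_1": [9, 8, 1, 0],
    "es_2": [8, 9, 0, 1],
    "mef_1": [1, 0, 9, 8],
    "mef_2": [0, 1, 8, 9],
}
clusters = cluster_cells(profiles)
```

With these toy profiles the two ES-like cells and the two MEF-like cells fall into separate clusters.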
Cell type specific expression pattern
Gene expression mapped on the cellular landscape. The number of hits to each gene, normalized to transcripts per million (t.p.m.) sequencing reads, is shown on a logarithmic color scale (inset, upper left). The left column shows housekeeping genes selected from a range of average t.p.m. levels. The middle column shows genes known as ES cell markers. The right column shows genes that were determined in this study to be preferentially expressed in Neuro2A.
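A minimal sketch of the t.p.m. scaling described above, assuming per-gene hit counts are already available for one cell. The gene names and counts are made up, and the log10 transform with a +1 offset is only one way a logarithmic color scale might be fed.

```python
import math


def transcripts_per_million(gene_hits):
    """Scale per-gene hit counts so that they sum to one million."""
    total = sum(gene_hits.values())
    if total == 0:
        raise ValueError("no hits to normalize")
    return {gene: hits * 1_000_000 / total for gene, hits in gene_hits.items()}


def log_scale(tpm_value):
    """Map a t.p.m. value onto a log10 scale; the +1 offset (an assumption)
    keeps genes with zero hits defined."""
    return math.log10(tpm_value + 1)


# Hypothetical hit counts for one cell.
hits = {"Actb": 500, "Pou5f1": 300, "Nanog": 200}
tpm = transcripts_per_million(hits)
```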
Single-cell transcriptional profiling
Clontech SMARTer ultra low RNA kit for Illumina sequencing
Sequencing the transcriptome of a single cell
Workflow: sort cells → Smart-Seq amplification (cDNA) → Illumina library prep → NGS sequencing
Cells    RNA
1        0.01 ng
10       0.1 ng
100      1 ng
1000     10 ng
10000    100 ng
[Figure scale labels: Good, Bad]
SMARTer™ technology overview
Key aspects of the SMARTer™ protocol:
• Switching mechanism at 5’ end of RNA template
• Single-tube, single-enzyme cDNA synthesis
• SMARTer oligo provides increased template-switching efficiency of RT
• Minimal handling of starting material lowers the probability of RNA degradation
• Enrichment for full-length cDNA transcripts
Workflow overview
• Total RNA → SMART cDNA synthesis → full-length ds cDNA amplification
  – ~5 hours; automatable; SPRI purification
• Day two: library preparation by one of two routes
  – Covaris → end repair → A-tailing → adapter ligation → PCR amplification (1 day)
  – Nextera tagmentation → PCR amplification (< 2 hours; automatable; SPRI purification)
Primary sequencing metrics
[Figure: chart of % unique reads, % mapped reads, and % rRNA gene for two replicates each at 10 ng, 1 ng, 0.1 ng, 0.05 ng, and 0.01 ng input.]
Reproducibility with various amounts of input RNA
• Scatter plots compare gene counts (i.e., log2 RPKM values) for replicate samples prepared using 10 ng, 1 ng, and 0.1 ng of mouse brain total RNA.
• Input levels represent the amount of RNA obtained from ~500, 50, and 5 cells, respectively.
• Reproducibility typically decreases as the amount of input decreases.
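For readers unfamiliar with the log2 RPKM values plotted here, a small sketch using the standard RPKM definition (reads per kilobase of transcript per million mapped reads). The pseudocount of 1 before taking log2 and the example numbers are assumptions for illustration; the slide does not state how zero counts were handled.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_rpkm(read_count, gene_length_bp, total_mapped_reads, pseudocount=1.0):
    """log2 of RPKM; the pseudocount (an assumption) avoids log2(0)."""
    return math.log2(rpkm(read_count, gene_length_bp, total_mapped_reads) + pseudocount)


# Hypothetical gene: 200 reads on a 2,000 bp transcript, 10 million mapped reads.
example_rpkm = rpkm(200, 2000, 10_000_000)
example_log2 = log2_rpkm(200, 2000, 10_000_000)
```

Plotting `log2_rpkm` for the same gene in two replicates gives one point of a replicate scatter plot.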
Base coverage
Sequencing coverage of SMARTer ultra low library
• 724 genes were analyzed for average coverage across the entire length of the transcripts (x-axis: % distance from 5’).
• The graphs show consistent results between the 1 ng, 0.1 ng, 0.5 ng and 0.01 ng input amounts of mouse brain total RNA.
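One way to build a coverage curve against % distance from the 5’ end is to cut every transcript into the same number of equal slices, normalize by the transcript's mean coverage, and average across transcripts. The sketch below follows that approach; the bin count, the normalization, and the input data are assumptions, since the slide does not describe the exact procedure.

```python
def coverage_profile(per_base_coverage, n_bins=10):
    """Mean coverage in n_bins equal slices, ordered from 5' to 3'."""
    length = len(per_base_coverage)
    if length < n_bins:
        raise ValueError("transcript shorter than the number of bins")
    profile = []
    for b in range(n_bins):
        start = b * length // n_bins
        end = (b + 1) * length // n_bins
        window = per_base_coverage[start:end]
        profile.append(sum(window) / len(window))
    return profile


def mean_normalized_profile(transcripts, n_bins=10):
    """Average the binned profiles after dividing each transcript by its own
    mean coverage, so highly expressed genes do not dominate the curve."""
    totals = [0.0] * n_bins
    used = 0
    for coverage in transcripts:
        mean_cov = sum(coverage) / len(coverage)
        if mean_cov == 0:
            continue  # skip transcripts with no reads
        for i, value in enumerate(coverage_profile(coverage, n_bins)):
            totals[i] += value / mean_cov
        used += 1
    if used == 0:
        raise ValueError("no covered transcripts")
    return [t / used for t in totals]


# Hypothetical per-base coverage: two uniformly covered transcripts.
uniform = mean_normalized_profile([[10] * 20, [2] * 40], n_bins=4)
```

A flat profile near 1.0 in every bin indicates uniform coverage; a rise toward the last bins would indicate 3’ bias.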
Accuracy of SMARTer ultra low compared to Taqman
[Figure: two scatter plots for MAQC UHR/Brain samples, log2 qPCR ratio (brain vs UHR) against log2 sequencing count ratio (brain vs UHR), both axes from -10 to 10.]
• 1 ng total RNA: number of genes retained 705; correlation (R) 0.942; slope 0.913
• 0.1 ng total RNA: number of genes retained 581; correlation (R) 0.856; slope 0.754
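The correlation and slope reported for each panel can be computed from paired log2 ratios with an ordinary least-squares fit. The sketch below is generic: which measurement is treated as x is an assumption, as is the pseudocount in the ratio, and the numbers are invented.

```python
import math


def log2_ratio(sample_a, sample_b, pseudocount=1.0):
    """log2 fold change of sample_a over sample_b; the pseudocount is an
    assumption that keeps zero counts defined."""
    return math.log2((sample_a + pseudocount) / (sample_b + pseudocount))


def slope_and_correlation(x, y):
    """Least-squares slope of y on x and the Pearson correlation R."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Hypothetical paired log2 ratios (brain vs UHR) for five genes.
qpcr_ratios = [-2.0, -1.0, 0.0, 1.0, 2.0]
seq_ratios = [-1.8, -0.9, 0.0, 0.9, 1.8]
slope, r = slope_and_correlation(qpcr_ratios, seq_ratios)
```

A slope below 1 with high R, as in this toy data, means the second measurement compresses fold changes while preserving their ranking.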
Performance summary
• Sensitive cDNA synthesis technology combined with Illumina next-generation sequencing
• Single-tube protocol, robust library generation starting from picogram quantities of total RNA
• High mapping rate, wide dynamic range, accurate gene quantification, and uniform transcript coverage
• The SMARTer kit has been used and validated by more than 100 labs around the world
• Fluidigm C1 Single-Cell Autoprep system has been customized for the SMARTer assay
Example 1: Gene-expression “landscape” of hematopoietic stem cells (HSCs)
Transcriptional ‘architecture’ of the first steps of the human hematopoietic hierarchy
‘Distances’ between hematopoietic populations, as measured by difference in expression in the downstream population relative to that in its progenitor (over twofold difference; FDR, <0.05), overlaid on the present hierarchical model of human hematopoietic differentiation.
(John Dick, University of Toronto)
Reference: Laurenti E, Doulatov S, Zandi S, Plumb I, Chen J, April C, Fan JB, Dick JE. The transcriptional architecture of early human hematopoiesis identifies multilevel control of lymphoid commitment. Nature Immunology. 2013.
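The selection rule quoted above (over twofold difference; FDR < 0.05) can be illustrated with a short filter. The slide does not say how the FDR was estimated, so the Benjamini-Hochberg adjustment used here is an assumption, and the gene names, fold changes, and p-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differential_genes(results, min_fold=2.0, max_fdr=0.05):
    """Select genes changed more than min_fold in either direction with an
    adjusted p-value below max_fdr.

    results: list of (gene, fold_change, pvalue), where fold_change is the
    expression in the downstream population divided by that in its progenitor.
    """
    fdr = benjamini_hochberg([p for _, _, p in results])
    selected = []
    for (gene, fold_change, _), q in zip(results, fdr):
        changed = fold_change > min_fold or fold_change < 1 / min_fold
        if changed and q < max_fdr:
            selected.append(gene)
    return selected


# Hypothetical test results for four genes.
results = [("geneA", 3.0, 0.01), ("geneB", 0.25, 0.04),
           ("geneC", 1.5, 0.03), ("geneD", 4.0, 0.5)]
selected = differential_genes(results)
```

Counting the selected genes for each progenitor/downstream pair would give one possible ‘distance’ between the two populations.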
Example 2: Single-cell transcriptome analysis of mammalian cell cycle
Single-cell transcriptomes of different cell cycle stages
[Figure: expression of Cdt1 and Geminin in G1 and G2 cells; fluorescence (dR) plotted against cycle number.]
Li et al., Biotechnol. Adv. 2013
(John Zhong, USC)
Molecular map of cell cycle
Single-cell transcriptomes can be organized by similarity into a molecular map that reconstructs stepwise cell cycle events at the molecular level.
(John Zhong, USC)
Example 3: NIH single cell analysis program (SCAP)
The UCSD (PI)/Harvard/Scripps/Illumina team
• George Church (Harvard)
• Jerold Chun (TSRI)
• Kun Zhang (UCSD)
• Jian-Bing Fan (Illumina)
• Wei Wang (UCSD)
• Mostafa Ronaghi (Illumina)
Areas: samples, data, methods
NIH Single Cell Analysis Program
• Three centers funded by the National Institutes of Health's Common Fund, through its Single Cell Analysis Program (SCAP)
  – UCSD, USC and UPenn
• Single-cell sequencing and in-situ mapping of mRNA transcripts in human brains:
  – Generating total-RNAseq data on 10,000 microdissected single cells or flow-sorted single nuclei from human cortex, to create a 3D transcriptional map of the human brain
  – Development and optimization of an in-situ RNA sequencing technology
  – In-situ mapping of ~500 transcripts in 36 cortex sections, and integration with 10,000 sets of total-RNAseq data
  – Includes UCSD (Kun Zhang (PI), Wei Wang), Scripps (Jerold Chun), Harvard (George Church), Illumina (Jian-Bing Fan, Mostafa Ronaghi)
Approach
• Sample preparation (TSRI)
  – Microdissection of neurons and glia
  – Flow sorting of neuronal and non-neuronal nuclei
• Single-cell total-RNAseq (Illumina & UCSD)
  – RNA transcripts +/- A-tails
  – Long and short transcripts
  – Strand specificity
  – Batch processing in 96-well plates
• RNA in situ sequencing (UCSD, Harvard & Illumina)
  – In-situ conversion of single RNA molecules into DNA nanoballs (rolonies)
  – In-situ decoding and counting by hybridization or sequencing on an automated confocal microscope with customized fluidic devices
Single-cell transcriptome sequencing methods
• Surani/LifeTech: full-length mRNA (Tang et al. 2009)
• STRT: mRNA 5’-end sequencing (Islam et al. 2011)
• CEL-seq: mRNA 3’-end sequencing (Hashimshony et al. 2012)
• Smart-seq: full-length mRNA (Ramskold et al. 2012)
• Smart-seq2: full-length mRNA (Picelli et al. 2013)
• Toto-RNAseq (UCSD/Illumina, being developed)
  – Full length
  – Strand specific
  – mRNAs and ncRNAs
  – High throughput
Context is important
Murray et al. Nat. Methods, 2008
RNA FISH
• RNA FISH + epifluorescent imaging: Raj et al. Nat. Methods, 2008
• Barcoded RNA FISH + STORM: Lubeck et al. Nat. Methods, 2012
In situ sequencing for RNA analysis in preserved tissue and cells
Ke and Nilsson et al. Nat. Methods, 2013
Fluorescent in situ sequencing (FISSEQ)
(Jay Lee and George Church, Harvard)
Two sequencing chemistries
(Jay Lee and George Church, Harvard)
Characterization of the 3D RNA-Seq library
The system was able to sequence the whole transcriptome in situ in 3D, mapping over 100,000 reads and 6000 clusters and detecting mRNA, ncRNA, and antisense RNA, which can then strongly indicate the cell type.
(Jay Lee and George Church, Harvard)
Single cell sequencing applications
• Cancer
  – Early diagnosis of cancer
    Circulating tumor cells may be present before …
    Limited clinical samples and early stage cancers
• Heterogeneity in tumors
  – Change in clonal population post-treatment
• Brain transcriptome
  – 3-D transcriptome map of a brain at high resolution
• Human cell lineage tree in health and disease (European Commission)
• Embryo to adult
  – Accumulation of somatic mutations with cell division
  – Stem cell differentiation
  – Cellular origin mapping
• Fetal cells
• Single cell microbes (metagenomes)
Summary
• Single-cell transcriptomes provided comprehensive molecular characterization of individual cells and revealed unique cell types/stages; discovered cell types correspond to marker-based cell types
• Systematic whole-organism cell mapping is feasible
  – Millions of single-cell transcriptomes needed
• Future technology development and integration
  – Isolation, identification & characterization of cells from all organs and systems in health, disease, & post-mortem
  – Molecular characterization of individual cells (e.g. single cell RNA-Seq)
  – Platforms: next-gen sequencing, microfluidics, DNA arrays, & other analyses of individual cells
  – Three-dimensional subcellular transcriptome sequencing in situ
  – Real-time measurement
  – Computer science & systems: extremely large-scale data capture, analysis, coalescence & management tools, methods & algorithms, cell lineage analysis & reconstruction algorithms, interactive data analyses & presentation
  – Mathematics & statistics
Acknowledgements
• STRT technology development
  – Sten Linnarsson (Karolinska Inst)
  – Saiful Islam (Karolinska Inst)
• SMART kit development
  – Shujun Luo (Illumina)
  – Gary Schroth (Illumina)
  – Richard Sandberg (Ludwig Institute for Cancer Research)
  – Daniel Ramskold (Ludwig Institute for Cancer Research)
  – Andrew Farmer (Clontech)
• HSC and cell cycle projects
  – John Dick (Ontario Cancer Institute, University of Toronto)
  – Elisa Laurenti (Ontario Cancer Institute)
  – John Zhong (University of Southern California)
• NIH SCAP
  – Kun Zhang (PI; UCSD)
  – Wei Wang (UCSD)
  – Jerold Chun (Scripps)
  – Jian-Bing Fan (Illumina)
  – Mostafa Ronaghi (Illumina)
  – Jay Lee (Harvard)
  – George Church (Harvard)
Thank You