Brendel Group Presentation: 4 Mar 2014


Published on March 4, 2014

Author: danielstandage

Source: slideshare.net

Description

A review of the 3 transcript reconstruction modes available in the Trinity RNA-Seq package.

Transcript reconstruction algorithms available in the Trinity RNA-Seq package
Daniel Standage
Brendel Group, Indiana University
4 Mar 2014

Introduction: RNA-Seq

RNA-Seq allows examination of transcriptomes that is deep, effective, and affordable.

High throughput comes at the expense of contiguity... well, at least for now.

Introduction: Assembly with Trinity

Transcriptome assembly: in the absence of full-length transcript sequences, reconstruct full-length sequences from fragments.

Trinity RNA-Seq now offers 3 transcript reconstruction modes:
- Butterfly (default)
- --PasaFly
- --CuffFly

Review outline
- Trinity algorithm
- PASA algorithm
- Cufflinks algorithm
- Discussion

Trinity: Inchworm

Step 1: Inchworm. Assemble unique contigs representing transcript subsequences. This often produces the dominant isoform in full length, and then just the unique portions of alternative isoforms.

Inchworm procedure
1. Create a dictionary of k-mers (k = 25).
2. Remove k-mers containing probable errors (based on coverage?).
3. Select the most frequently occurring k-mer.
4. Build a contig by extending the k-mer: find the most frequently occurring k-mer with a k − 1 bp overlap, extend by 1 bp, and remove that k-mer from the dictionary.
5. Repeat the previous step until the contig cannot be extended further, then report the contig.
6. Repeat steps 3-5 until all k-mers are exhausted.
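The greedy extension in steps 3-6 can be sketched in a few lines of Python. This is a minimal illustration, not Inchworm's actual implementation. The function names, the `min_count` error filter, and the choice to extend the seed to the right and then to the left are assumptions made for the example, and reverse complements are ignored.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Step 1: count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def extend(contig, counts, k, direction):
    """Steps 4-5: grow the contig 1 bp at a time using the most frequent
    remaining k-mer that overlaps its end by k - 1 bp (assumes k >= 2)."""
    while True:
        if direction == "right":
            overlap = contig[-(k - 1):]
            candidates = [overlap + base for base in "ACGT"]
        else:
            overlap = contig[:k - 1]
            candidates = [base + overlap for base in "ACGT"]
        candidates = [c for c in candidates if c in counts]
        if not candidates:
            return contig
        best = max(candidates, key=lambda c: counts[c])
        del counts[best]  # each k-mer is used at most once
        if direction == "right":
            contig = contig + best[-1]
        else:
            contig = best[0] + contig


def inchworm_sketch(reads, k=25, min_count=1):
    counts = kmer_counts(reads, k)
    # Step 2 (assumed form): treat k-mers below min_count as probable errors.
    for kmer in [km for km, c in counts.items() if c < min_count]:
        del counts[kmer]
    contigs = []
    while counts:  # step 6: repeat until all k-mers are exhausted
        seed = max(counts, key=counts.get)  # step 3: most frequent k-mer
        del counts[seed]
        contig = extend(seed, counts, k, "right")
        contig = extend(contig, counts, k, "left")
        contigs.append(contig)
    return contigs


if __name__ == "__main__":
    # Two overlapping toy reads and a small k, for illustration only.
    print(inchworm_sketch(["ATGGCGTG", "GCGTGCA"], k=4))
```

Deleting each k-mer as it is used is what guarantees termination: a repeat in the sequence cannot send the extension around a cycle indefinitely.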

Trinity: Chrysalis

Step 2: Chrysalis. Group Inchworm contigs and construct a de Bruijn graph for each cluster. Each connected component of the graph corresponds to one or more genes with shared sequence.

Chrysalis procedure
1. Group contigs if they share a perfect overlap of k − 1 bp (with reads supporting the overlap).
2. Build a de Bruijn graph with a word size of k − 1 for nodes and k for edges; edges are weighted by supporting reads.
3. Assign each read to the component with which it shares the largest number of k-mers.
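Steps 2 and 3 of the procedure lend themselves to a short sketch. The code below is an illustration under simplifying assumptions, not Chrysalis itself: edge weights are plain k-mer occurrence counts, each component is represented as a set of k-mers, and all function names are hypothetical.

```python
from collections import Counter


def de_bruijn_edges(reads, k):
    """Step 2: nodes are (k-1)-mers, edges are k-mers.
    Each edge is weighted by the number of read k-mers supporting it."""
    edges = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges


def kmers(seq, k):
    """Set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_read(read, components, k):
    """Step 3: index of the component sharing the most k-mers with the read,
    or None if the read shares no k-mer with any component."""
    read_kmers = kmers(read, k)
    shared = [len(read_kmers & component) for component in components]
    best = max(range(len(components)), key=shared.__getitem__, default=None)
    if best is None or shared[best] == 0:
        return None
    return best


if __name__ == "__main__":
    edges = de_bruijn_edges(["ACGTA", "CGTAC"], 3)
    print(edges[("CG", "GT")])  # supported by both toy reads
    components = [kmers("ACGTAC", 3), kmers("TTTGGG", 3)]
    print(assign_read("CGTA", components, 3))
```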

Trinity: Butterfly

Step 3: Butterfly. Traverse read-supported paths in each subgraph and enumerate plausible sequences.

Butterfly procedure
1. Graph simplification: merge consecutive nodes in linear paths, pruning minor deviations.
2. Plausible path scoring: identify paths in the graph with read support.
   - Initialize the DP table with source nodes (no incoming edges).
   - Fill in the table by extending path prefixes by one node.
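The prefix-extension idea in step 2 can be illustrated with a toy enumeration. This sketch makes strong simplifying assumptions: the graph is acyclic, read support is reduced to a per-edge count, and a hypothetical `min_support` threshold decides which edges count as supported. Butterfly's actual scoring of read support along paths is more involved than this.

```python
def enumerate_paths(edges, min_support=1):
    """edges maps (u, v) -> number of reads supporting that edge.
    Returns every source-to-sink path through supported edges.
    The graph is assumed to be acyclic."""
    successors = {}
    nodes, has_incoming = set(), set()
    for (u, v), support in edges.items():
        if support >= min_support:  # keep only read-supported edges
            successors.setdefault(u, []).append(v)
            nodes.update((u, v))
            has_incoming.add(v)

    # Initialize the table with source nodes (no incoming edges).
    table = [[source] for source in sorted(nodes - has_incoming)]
    complete = []
    while table:
        next_table = []
        for prefix in table:
            nexts = successors.get(prefix[-1], [])
            if not nexts:
                complete.append(prefix)  # reached a sink
            for v in nexts:
                next_table.append(prefix + [v])  # extend prefix by one node
        table = next_table
    return complete


if __name__ == "__main__":
    # Toy splice graph: B-C-E is well supported, B-D-E is weakly supported.
    toy = {("A", "B"): 5, ("B", "C"): 4, ("B", "D"): 1,
           ("C", "E"): 4, ("D", "E"): 1}
    print(enumerate_paths(toy, min_support=2))
    print(enumerate_paths(toy, min_support=1))
```

Raising `min_support` prunes the weakly supported branch, which is one simple way to see how read support limits the number of reported isoforms.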

PASA

Program to Assemble Spliced Alignments
- designed for ESTs and FL-cDNAs (pre-NGS era)
- works on sequence alignments
- computes consensus spliced alignments

PASA algorithm

Input: a set of spliced cDNA alignments A
Output: for each alignment a ∈ A, the largest assembly containing a

1. Sort alignments.
2. Test overlapping alignments for compatibility.
3. Build the DP table and backtrace to find the maximal assembly A∗.
4. If ∃ a ∉ A∗, build the reciprocal DP table and trace to enumerate additional assemblies.

PASA algorithm: recurrences

\[ L_a = \max_{b} \{ C_a,\; L_b + C_{a/b} \} \]
\[ R_a = \max_{b} \{ C_a,\; R_b + C_{a/b} \} \]

- L_a, R_a: maximum number of cDNAs in an assembly that contains alignment a, starting from the left and right (respectively)
- C_a: number of a-compatible alignments in the span of a
- C_{a/b}: number of a-compatible alignments in the span of a but not in the span of b
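The left-to-right recurrence can be transcribed almost literally. The sketch below is not PASA's implementation and rests on several assumptions that the slides do not state: alignments are reduced to (start, end) spans, "in the span of a" is read as full containment, a counts toward its own C_a, b ranges over compatible alignments sorted before a, and compatibility is supplied by the caller as a hypothetical `compatible(i, j)` predicate on indices.

```python
def left_scores(alignments, compatible):
    """alignments: list of (start, end) spans.
    compatible(i, j): caller-supplied predicate on alignment indices.
    Returns a dict mapping each alignment index a to L_a."""

    def within(i, j):
        # True if the span of alignment i lies inside the span of alignment j.
        return (alignments[j][0] <= alignments[i][0]
                and alignments[i][1] <= alignments[j][1])

    def count(a, exclude=None):
        # C_a when exclude is None, otherwise C_{a/b} with b = exclude.
        total = 0
        for i in range(len(alignments)):
            if not within(i, a):
                continue
            if i != a and not compatible(a, i):
                continue
            if exclude is not None and within(i, exclude):
                continue
            total += 1
        return total

    order = sorted(range(len(alignments)), key=lambda i: alignments[i])
    L = {}
    for pos, a in enumerate(order):
        best = count(a)                        # the C_a term
        for b in order[:pos]:                  # alignments sorted before a
            if compatible(a, b):
                best = max(best, L[b] + count(a, exclude=b))
        L[a] = best
    return L


if __name__ == "__main__":
    spans = [(0, 10), (5, 15), (12, 20)]
    print(left_scores(spans, lambda i, j: True))
```

With three mutually compatible, staggered alignments, each one extends the chain ending at its predecessor, so the scores grow by one per alignment. The R_a table would be filled the same way from the right.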

Cufflinks

- designed for short transcript reads (NGS era)
- works on read alignments (mappings)
- identifies the fewest transcripts that “explain” the read mappings

Cufflinks algorithm

Input: overlap graph G of mapped reads
Output: a minimum path cover of G, with each path corresponding to a single assembled transcript

1. Divide alignments into non-overlapping loci.
2. Remove erroneous read alignments.
3. Compute the transitive reduction of G.
4. Construct a bipartite graph G∗ from the transitive closure of the reduced graph, with edges weighted by coverage to “phase” distant exons.
5. Compute a minimum-cost maximum matching in G∗, which corresponds to a minimum path cover of G.
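The link between bipartite matching and a minimum path cover (steps 4-5) can be shown on a toy graph. The sketch below is not the Cufflinks implementation. It assumes an acyclic graph and ignores the coverage-based edge costs, so it computes a plain maximum matching with augmenting paths rather than a minimum-cost one. Function names are hypothetical.

```python
def transitive_closure(nodes, edges):
    """Map each node to the set of nodes reachable from it (DAG assumed)."""
    successors = {u: set() for u in nodes}
    for u, v in edges:
        successors[u].add(v)
    closure = {}

    def reach(u):
        if u not in closure:
            closure[u] = set()
            for v in successors[u]:
                closure[u] |= {v} | reach(v)
        return closure[u]

    for u in nodes:
        reach(u)
    return closure


def min_path_cover(nodes, edges):
    """Cover all nodes with the fewest chains, where consecutive nodes in a
    chain are connected in the transitive closure."""
    closure = transitive_closure(nodes, edges)
    match_to = {}  # right-side node -> left-side node matched to it

    def augment(u, seen):
        # Try to match u, re-routing earlier matches if necessary.
        for v in sorted(closure[u]):
            if v in seen:
                continue
            seen.add(v)
            if v not in match_to or augment(match_to[v], seen):
                match_to[v] = u
                return True
        return False

    for u in nodes:
        augment(u, set())

    # Each matched pair (u, v) means "v follows u" in some chain.
    next_node = {u: v for v, u in match_to.items()}
    paths = []
    for start in nodes:
        if start in match_to:
            continue  # has a predecessor, so it is not the start of a chain
        path = [start]
        while path[-1] in next_node:
            path.append(next_node[path[-1]])
        paths.append(path)
    return paths


if __name__ == "__main__":
    # B and C are alternatives between A and D, so two chains are needed.
    print(min_path_cover(["A", "B", "C", "D"],
                         [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]))
```

The number of chains equals the number of nodes minus the size of the matching, which is why a maximum matching yields the fewest transcripts. In this sketch ties between equally small covers are broken arbitrarily, whereas the slides describe coverage weights as the tie-breaker.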

Discussion: three different construction approaches
- Butterfly: enumerate all plausible transcripts with minimal read support
- PASA: for each alignment, find the largest assembly (transcript) containing the alignment
- Cufflinks: find the minimal assembly or assemblies that explain the data, using read coverage to “phase” distant exons

Discussion: next time, a comparison of 8 Trinity assemblies

Four assembly settings
- Butterfly
- --PasaFly
- --CuffFly
- Butterfly, --min_kmer_cov 2

Two input data sets
- Groomed data
- Groomed data with digital normalization

Hypotheses (transcripts per assembly)
- Butterfly > PasaFly > CuffFly
- Diginorm > No diginorm

Thank you!
