VIZBI 2014 - Visualizing Genomic Variation

Published on March 7, 2014

Author: jandot



This talk was given at the VizBi 2014 conference. See

Visualizing Genomic Variation
Prof Jan Aerts
Faculty of Engineering - ESAT/STADIUS
iMinds Medical ICT Department
KU Leuven

What is genomic variation?

transitions / transversions / “copy number variation”
Aerts & Tyler-Smith, In: Encyclopedia of Life Sciences, 2009
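The distinction between transitions and transversions can be made concrete with a few lines of Python. This is only a minimal sketch and not part of the talk: the function name `classify_substitution` is hypothetical, and it covers single-base substitutions over A, C, G and T only. A transition swaps a purine for a purine (A<->G) or a pyrimidine for a pyrimidine (C<->T); every other substitution is a transversion.

```python
# Illustrative sketch; the names below are hypothetical, not from the talk.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical: not a substitution")
    # Both bases in the same chemical class -> transition, otherwise transversion.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` gives a transition, while `classify_substitution("A", "C")` gives a transversion.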

Effects of variation on phenotype
• change in protein abundance
• level of transcription or translation (loss/gain)
• stability
• change in protein structure (partly deleted, fusion genes, …)

What are we interested in?
• multiple samples
• show all affected genes (or functional units)
• cluster individuals
• functional effect of structural variation
• gene-centric instead of positionally ordered: coordinate-free view
• high-level annotations (pathways, GO-terms)
• uncertainty (statistical & positional) and underlying evidence

[Workflow figure; labels:] DNA sequencing; QC; read mapping; variant calling; variant filtering; what is effect of variant?; check signal; QC

Single Nucleotide Polymorphisms

General approach: reference-based

UCSC Ensembl

Variant View
 sequence variants in gene context Ferstay et al, IEEE InfoVis, 2013

Integrative Genomics Viewer (IGV)

Sequence logo
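As a rough illustration of what a sequence logo encodes, the sketch below computes per-column information content for aligned DNA sequences and scales each base's letter height by its frequency. It assumes the usual definition (log2 of the alphabet size minus the column's Shannon entropy) and leaves out the small-sample correction; the function names are hypothetical and not taken from the talk.

```python
import math
from collections import Counter

DNA = "ACGT"


def column_information(column: str, alphabet: str = DNA) -> float:
    """Information content (bits) of one alignment column, no small-sample correction."""
    counts = Counter(column.upper())
    total = sum(counts[b] for b in alphabet)
    if total == 0:
        raise ValueError("column contains no symbols from the alphabet")
    entropy = 0.0
    for base in alphabet:
        p = counts[base] / total
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(alphabet)) - entropy


def logo_heights(sequences):
    """Per-column letter heights: base frequency times column information."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    heights = []
    for column in zip(*sequences):
        col = "".join(column).upper()
        info = column_information(col)
        counts = Counter(col)
        total = sum(counts[b] for b in DNA)
        heights.append({b: counts[b] / total * info for b in DNA if counts[b]})
    return heights
```

A fully conserved column carries 2 bits and a column with all four bases at equal frequency carries 0 bits, which is why conserved positions stand tall in a logo.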

Sequence Diversity Diagram

Structural Variation

dotplot Pevzner & Tesler, Genome Research, 2003
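A dotplot places one sequence on each axis and marks every position pair where the two sequences match. The sketch below is a minimal illustration using exact k-mer matches; it is not the method of the cited paper, and the function names and the default `k` are assumptions chosen for the example.

```python
def dotplot(seq_x: str, seq_y: str, k: int = 3):
    """Return (i, j) pairs where the k-mer at seq_x[i] equals the k-mer at seq_y[j]."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Index all k-mer start positions of seq_y for fast lookup.
    index = {}
    for j in range(len(seq_y) - k + 1):
        index.setdefault(seq_y[j:j + k], []).append(j)
    dots = []
    for i in range(len(seq_x) - k + 1):
        for j in index.get(seq_x[i:i + k], []):
            dots.append((i, j))
    return dots


def render(dots, width: int, height: int) -> str:
    """Draw the dots as text: columns are positions in seq_x, rows positions in seq_y."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for i, j in dots:
        grid[j][i] = "*"
    return "\n".join("".join(row) for row in grid)
```

Comparing a sequence with itself yields the main diagonal, and repeats show up as additional diagonal runs parallel to it; rearrangements between two different sequences appear as displaced diagonal segments.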

read depth information: arrayCGH and next-generation sequencing Xie & Tammi, BMC Bioinformatics, 2009

next-generation sequencing: read-pair information
Medvedev, Nature Methods, 2009
Stephens et al, Cell, 2011
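To illustrate how read-pair information can hint at structural variation, here is a deliberately simplified sketch. It is not the method of the papers cited above. It assumes a paired-end library in which concordant pairs map to the same chromosome in forward/reverse orientation at roughly the expected insert size; the `ReadPair` class, the thresholds and the returned labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair: ReadPair, expected_insert: int = 400, tolerance: int = 100) -> str:
    """Label a mapped read pair with the kind of event it might support."""
    if pair.chrom1 != pair.chrom2:
        return "translocation candidate"
    # Order the two mates by mapped position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion candidate"
    if (left_strand, right_strand) == ("-", "+"):
        # Mates point away from each other (everted pair).
        return "tandem duplication candidate"
    span = right_pos - left_pos
    if span > expected_insert + tolerance:
        return "deletion candidate"
    if span < expected_insert - tolerance:
        return "insertion candidate"
    return "concordant"
```

In practice a single discordant pair is weak evidence; callers typically look for clusters of pairs that support the same event.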

Integrate read-depth and read-pair information
Stephens et al, Cell, 2010
Meander Pavlopoulos et al, Nucleic Acids Research, 2013

From data generation to data interpretation: understanding the effect of structural variation

linearity of reference chromosome broken by structural variation, but still using the reference for comparison
UCSC Genome Browser
=> domain expert needs to try and “wrap his head around” the data
=> need to lessen the cognitive load in interpretation: change a cognitive task into a perceptual one

Nielsen & Wong, Nat Methods, 2012

represent the chromosome as it is in vivo (=~ FISH)
Feuk, Nature Reviews Genetics, 2006
reconstruct rearranged chromosome based on graph structure of segments

breakpoint graph Pevzner & Tesler, Genome Research, 2003

focus on functional impact - Pipit Sakai et al, submitted


Challenges
• visual and interaction scalability
• deep sequencing => very high depth per track
• high-dimensional data: many tracks (n=98!)
• genome size: HSA1 = 240Mb = 240,000 screens at 1pixel/bp = 72km
• compare multiple samples
• computational scalability
• how to compute fast enough to make interactivity possible? (e.g. switching between data resolutions)
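The genome-size figures above can be reproduced with a short back-of-the-envelope calculation. The slide gives only the results, so the screen width of 1,000 pixels and the physical width of 30 cm used below are assumptions, chosen because they make 240 Mb come out at 240,000 screens and 72 km; the function names are hypothetical.

```python
def screens_needed(genome_bp: int, pixels_per_bp: float = 1.0,
                   screen_width_px: int = 1000) -> float:
    """Number of screen widths needed to show the whole sequence side by side."""
    # screen_width_px = 1000 is an assumption inferred from the slide's numbers.
    return genome_bp * pixels_per_bp / screen_width_px


def physical_length_km(n_screens: float, screen_width_m: float = 0.30) -> float:
    """Total physical length in km if the screens were laid end to end."""
    # screen_width_m = 0.30 (30 cm) is likewise an assumption.
    return n_screens * screen_width_m / 1000


hsa1_bp = 240_000_000  # human chromosome 1, rounded to 240 Mb as on the slide
n = screens_needed(hsa1_bp)
print(n, physical_length_km(n))
```

Under these assumptions a single chromosome at base-pair resolution is far beyond what can be browsed linearly, which is the scalability point the list makes.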

Thank you
• Authors of papers mentioned
• Bioinformatics/Visual Analytics Leuven
• Ryo Sakai
• Raf Winand
• Thomas Boogaerts
• Toni Verbeiren
• Georgios Pavlopoulos
• Data Visualization Lab
• Erik Duval
• Andrew Vande Moere


