Stamatakis18 3 04


Published on November 29, 2007

Author: Yuan

Source: authorstream.com


Parallel & Distributed Systems and Algorithms for Inference of Large Phylogenetic Trees with Maximum Likelihood
Alexandros Stamatakis, LRR TU München
Contact: stamatak@cs.tum.edu

Outline
- Motivation
- Introduction to phylogenetic tree inference
- Statistical inference methods
- Maximum Likelihood & associated problems
- Solutions: two simple heuristics; a parallel & distributed implementation
- Results
- Conclusion
- Availability & Future Work

Motivation: Towards a "Tree of Life"
- 30,000 organisms available; current trees contain <= 1,000
- [Slides contrast where we are with where we want to get]

Phylogenetic Tree Inference
- Input: a "good" multiple alignment of a distinguished, highly conserved part of the DNA sequences
- Output: an unrooted binary tree with the sequences at its leaves (all nodes of degree 1 or 3)
- Various methods for phylogenetic tree inference exist; they differ in computational complexity and in the quality of the resulting trees
- Most accurate methods: Maximum Likelihood (ML) and Bayesian phylogenetic inference
  + most sound and flexible methods
  + other methods are not suited for large/complex trees
  -- most computationally intensive methods

ML and Bayesian Methods
- T. Williams et al. (March 2003): a comparative analysis with simulated data shows that MrBayes is the best program
- Guindon et al. (May 2003): PHYML is a very fast & accurate ML program for real & simulated data, faster than MrBayes
- ML (PHYML, RAxML2):
  + significantly faster than MrBayes
  + provides reference/starting trees for Bayesian methods
  -- less powerful statistical model
- Bayesian inference (MrBayes):
  + powerful statistical model
  -- MCMC convergence problem
- Memory requirements for a 1,000/10,000-taxon alignment:
  RAxML: 200 MB / 750 MB
  PHYML: 900 MB / 8.8 GB
  MrBayes: 1,150 MB / unknown

MCMC Convergence Problem
[Figure slide]

What Does ML Compute?
- Maximum Likelihood calculates: topologies, branch lengths v[i], and the likelihood of the tree
- Goal: find the tree topology which maximizes the likelihood
- Problem I: the number of possible topologies is exponential in n
- Problem II: computing the likelihood value and optimizing the branch lengths is expensive
- Solution: algorithmic optimizations (previous work) + new heuristics + HPC
[Figure: example tree with leaves S1-S5 and branch lengths v1-v7]

New Heuristics for RAxML
- Two common methods to build a tree:
  - progressive addition of organisms, e.g. the stepwise addition algorithm
  - start from a (random, simple) tree containing all organisms and optimize its likelihood by applying topological changes
- RAxML (Randomized Axelerated Maximum Likelihood):
  - computes a parsimony starting tree with dnapars -> fast and relatively "good" initial likelihood
  - dnapars uses stepwise addition -> a randomized sequence input order yields distinct starting trees
  - optimizes the starting tree by applying rearrangements
  - accelerates the rearrangements with two simple ideas

Subtree Rearrangements
[Animation slides: subtrees ST1-ST6 are moved to alternative branches at rearrangement distance +1, then +2; after each rearrangement all branches are optimized. Do we really need to optimize all branches?]
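The likelihood value referred to on the "What Does ML Compute?" slide is typically evaluated per alignment site with Felsenstein's pruning algorithm. As a minimal sketch, assuming the simple Jukes-Cantor (JC69) substitution model and an illustrative nested-dict tree encoding (RAxML itself uses more general models and heavily optimized data structures; all names below are made up for this example):

```python
# Per-site tree likelihood via Felsenstein's pruning algorithm under JC69.
# Illustrative sketch only -- not RAxML's actual implementation.
import math

STATES = "ACGT"

def jc69_prob(i, j, t):
    """JC69 transition probability from state i to state j over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def cond_likelihood(node):
    """Conditional likelihood vector L[x] = P(data below node | state x at node)."""
    if "state" in node:  # leaf: observed nucleotide
        return [1.0 if s == node["state"] else 0.0 for s in STATES]
    lv = [1.0] * 4
    for child, t in node["children"]:  # (subtree, branch length v[i])
        cl = cond_likelihood(child)
        for x in range(4):
            lv[x] *= sum(jc69_prob(STATES[x], STATES[y], t) * cl[y]
                         for y in range(4))
    return lv

def site_likelihood(root):
    # JC69 assumes uniform base frequencies (1/4 each) at the root.
    return sum(0.25 * l for l in cond_likelihood(root))

# Four-taxon example ((S1,S2),(S3,S4)) with fixed branch lengths.
tree = {"children": [
    ({"children": [({"state": "A"}, 0.1), ({"state": "A"}, 0.1)]}, 0.05),
    ({"children": [({"state": "C"}, 0.1), ({"state": "C"}, 0.1)]}, 0.05),
]}
print(site_likelihood(tree))
```

Because this recursion must be re-run for the likelihood vectors affected by every candidate rearrangement and every branch-length change, it dominates the cost of ML tree search, which is what the two heuristics below attack.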
Idea 1: Local Optimization of Branch Lengths
[Animation slides: after a subtree is inserted, only the branches local to the insertion point are optimized, not the whole tree]

Why Is Idea 1 Useful?
- Local optimization of branch lengths:
  - fewer likelihood vectors must be updated -> significantly faster
  - allows higher rearrangement settings -> better trees
- The likelihood depends strongly on the topology
- Fast exploration of a large number of topologies
- Straightforward parallelization
- Store the best 20 trees from each rearrangement step; perform branch length optimization only on these best 20 trees
- Experimental results justify this mechanism

Idea 2: Subsequent Application of Topological Changes
[Animation slides: an improving rearrangement of subtree ST3 is applied immediately, and further rearrangements continue on the changed topology]

Why Is Idea 2 Useful?
- During the initial 5-10 rearrangement steps many improved topologies are encountered
- Accelerates the likelihood improvement in the initial optimization phase
- Enables fast optimization of random starting trees

Remainder of this Talk
[Outline slide repeated: parallel & distributed implementation, Results, Conclusion, Availability & Future Work remain]

Basic Parallel & Distributed Algorithm
- Basic idea: distribute work by subtrees instead of topologies (as in, e.g., parallel fastDNAml)
- Simple master-worker architecture
- The subsequent application of topological changes introduces non-determinism
[Animation slides: the master assigns subtrees ST1-ST6 to workers, e.g. MPI_Send(ST3_ID, tree), MPI_Send(ST2_ID, tree)]

Differences between the Parallel & Distributed Algorithms
- Parallel: a best-tree list of max(20, #workers) trees is maintained and merged at the master
- Parallel: the master distributes the max(20, #workers) best trees as topology strings to the workers for branch length optimization
- Distributed: each worker maintains a local best list of 20 trees
- Distributed: each worker performs fast branch length optimization locally on all 20 trees -> returns only the best topology to the master

Sequential Results
- 50 distinct simulated 100-taxon alignments; measured average execution times & topological distance (RF-rate) from the "true" tree:
  PHYML: 35.21 seconds, RF-rate: 0.0796
  MrBayes: 945.32 seconds, RF-rate: 0.0741
  RAxML: 29.27 seconds, RF-rate: 0.0818
- 9 distinct real alignments containing 101-1,000 taxa; measured execution times & final likelihood values:
  - RAxML yields the best-known likelihood for all data sets
  - RAxML is faster than PHYML & MrBayes

Sequential Results: Real Data
[Chart slides]
Parallel Results: Speedup for 1000_ARB
[Chart slide]

Distributed Results: First Tests
- Platforms:
  - Infiniband cluster: 10 Intel Xeon 2.4 GHz
  - Sunhalle: 50 Sun workstations for CS students
- Alignments: 1000_ARB, 2025_ARB; larger trees to come
- Results: the program executed correctly & terminated; RAxML@home yielded the best-known tree for 2025_ARB

Biological Results: First ML 10,000-Taxon Tree
- Calculated 5 parsimony starting trees + 3-4 initial rearrangement steps sequentially on a Xeon 2.4 GHz
- Further rearrangements of those 5 trees in parallel on 32 or 64 Xeon 2.66 GHz at RRZE
- Accumulated CPU hours per tree: ~3,200
- Best ln likelihood: -949539; worst: -950026
- Problems:
  - Quality assessment? Bootstrapping is not feasible
  - Consense crashes for > 5 trees
  - MrBayes/PHYML crash on 32-bit/4 GB machines
  - MrBayes crashed on Itanium
  - Visualization?

Conclusion
- RAxML is not yet able to handle protein data
- RAxML is not yet able to perform model parameter optimization
- BUT: RAxML is easy to parallelize/distribute
- Accurate & fast for large trees
- Significantly lower memory requirements than MrBayes/PHYML
- To do: implement model parameter optimization & protein data support in RAxML

Availability & Future Work
- Further development & distribution of RAxML@home
- Big production runs with RAxML@home
- Survey: ML supertrees vs. integral trees
- Alignment split-up methods for ML supertrees
- RAxML implementation on GPUs
- RAxML2 download, benchmark, code: wwwbode.in.tum.de/~stamatak
- RAxML@home development: www.sourceforge.com/projects/axml
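The RF-rate reported in the sequential results is the normalized Robinson-Foulds distance: the number of bipartitions (splits) present in one tree but not the other, divided by the maximum possible, 2(n-3). A minimal sketch, assuming an illustrative nested-tuple tree encoding (the function names and encoding are made up for this example; the study presumably used standard tree-comparison tools):

```python
# Normalized Robinson-Foulds (RF) distance between two unrooted trees,
# as used for the "RF-rate" above. Illustrative sketch only.

def splits(tree, all_leaves):
    """Collect the non-trivial bipartitions induced by the tree's internal edges."""
    found = set()
    def walk(node):
        if isinstance(node, str):  # leaf label
            return frozenset([node])
        below = frozenset().union(*(walk(c) for c in node))
        if 1 < len(below) < len(all_leaves) - 1:
            # Store each split canonically via its lexicographically smaller side.
            found.add(min(below, all_leaves - below, key=sorted))
        return below
    walk(tree)
    return found

def rf_rate(t1, t2, leaves):
    """Symmetric difference of split sets, normalized by 2*(n-3)."""
    n = len(leaves)
    return len(splits(t1, leaves) ^ splits(t2, leaves)) / (2 * (n - 3))

taxa = frozenset("ABCDE")
t1 = ((("A", "B"), "C"), ("D", "E"))  # unrooted trees written as nested tuples
t2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_rate(t1, t1, taxa))  # identical topologies -> 0.0
print(rf_rate(t1, t2, taxa))  # one of two splits differs -> 0.5
```

An RF-rate of 0.08, as in the simulated-data results, thus means that roughly 8% of the inferred internal branches disagree with the "true" tree.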
