STAR: Recombination site prediction

Published on February 23, 2009

Author: allPowerde

Source: slideshare.net

Description

This presentation was given at CIBCB 2005 in San Diego and describes our approach to predicting recombination sites in protein sequences. Recombination is the method of choice for designing new proteins with desired new or enhanced properties.

The publication is:
Bauer, D.C., Bodén, M., Thier, R. and Gillam, E. M. “STAR: Predicting recombination sites from amino acid sequence.” BMC Bioinformatics, 2006 Oct 8; 7:437. PMID: 17026775
Predicting structural disruption caused by crossover: a machine learning approach

Denis C. Bauer

Talk, CIBCB 2005

Outline

Introduction to Protein Design

Theory of SCHEMA

Our Approach

Results

Summary

Protein

Biological Functions

Proteins are fundamental components of all living cells

Messenger function (e.g. hormones)

Catalytic function (e.g. enzymes)

Regulatory function (e.g. antibodies)

Protein Design for Industry and Medicine

Better adjusted

New function

Protein Structure

Pictures from: Principles of Biochemistry, Horton, Moran, Ochs, Rawn, Scrimgeour

Primary Structure

Secondary Structure

Tertiary Structure

Quaternary Structure

Protein Design

Creating new amino acid sequences

Huge sequence space: 20^100 possible amino acid sequences

Not every possible sequence is stable

Solution: use sequences that already exist
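To give a feel for the size of the sequence space, here is a minimal sketch of the arithmetic. It assumes the slide's figure is 20^100, i.e. 20 standard amino acids at each position of a protein 100 residues long. The function name is hypothetical.

```python
# Number of standard amino acids available at each sequence position.
ALPHABET_SIZE = 20


def sequence_space_size(length: int) -> int:
    """Return the number of distinct amino acid sequences of the given length."""
    return ALPHABET_SIZE ** length


size = sequence_space_size(100)
# Python integers are arbitrary precision, so the exact value is available.
print(f"{size:.3e}")
print(len(str(size)), "decimal digits")
```

Exhaustively testing a space of this size is not feasible, which is why the slide proposes starting from sequences that already exist.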

Benefit of Recombination

[Figure: two parent sequences, one with better heat resistance (from a mayfly that lives where it is hot) and one with higher performance, are recombined at a crossover site so that a hybrid can carry both properties.]

Problem: how to identify recombination sites?
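The recombination step itself can be sketched as a single-point crossover between two aligned parent sequences. This is an illustrative sketch only: the function name, the 0-based site convention, and the toy sequences are assumptions, not taken from the talk.

```python
from typing import Tuple


def crossover(parent_a: str, parent_b: str, site: int) -> Tuple[str, str]:
    """Swap the tails of two aligned parents.

    `site` is the 0-based index of the first residue taken from the other parent.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    if not 0 < site < len(parent_a):
        raise ValueError("site must fall strictly inside the sequence")
    hybrid_ab = parent_a[:site] + parent_b[site:]
    hybrid_ba = parent_b[:site] + parent_a[site:]
    return hybrid_ab, hybrid_ba


# Toy parents (hypothetical, not the sequences shown on the slide).
hybrid_1, hybrid_2 = crossover("ACDEFGHI", "KLMNPQRS", 3)
print(hybrid_1)  # ACDNPQRS
print(hybrid_2)  # KLMEFGHI
```

Every interior position is a possible site, so the question on the slide is which of these positions leaves the hybrid folded and functional.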

SCHEMA

Research group of Prof. Frances Arnold

Idea: recombine at positions where the fewest interactions are disrupted

[Figure: SCHEMA profile]
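The idea of counting disrupted interactions can be illustrated with a simplified sketch. It assumes a list of residue contacts has already been derived from the 3D structure and counts, for each candidate cut position, how many contacts span the cut. This is a simplification for illustration, not the published SCHEMA algorithm, and the contact list below is hypothetical.

```python
from typing import List, Tuple


def disruption_profile(contacts: List[Tuple[int, int]], length: int) -> List[int]:
    """Count contacts spanning each cut position.

    Entry k of the result belongs to the cut between residues k and k + 1
    (0-based), i.e. to crossover site k + 1.
    """
    profile = []
    for site in range(1, length):
        # A contact is broken when its two residues end up on opposite sides.
        broken = sum(1 for i, j in contacts if min(i, j) < site <= max(i, j))
        profile.append(broken)
    return profile


# Hypothetical contact pairs (0-based residue indices) for a 6-residue toy protein.
contacts = [(0, 1), (0, 2), (1, 2), (3, 5), (4, 5)]
profile = disruption_profile(contacts, 6)
best_site = 1 + profile.index(min(profile))
print(profile)    # [2, 2, 0, 1, 2]
print(best_site)  # 3
```

In this toy case the cut at site 3 separates two groups of residues that do not contact each other, so it has the lowest disruption count.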

Limitations

3D structure necessary

Problem: hard to derive for some proteins

time consuming

expensive

Solution: disengaging from the 3D structure

Our approach

Alternative to SCHEMA

SCHEMA: 3D structure information → SCHEMA algorithm → SCHEMA score

Our approach: sequence → prediction → SCHEMA score

Benefit: all proteins can be processed

Predicting SCHEMA-Profile

Sequence → predictive model* → predicted SCHEMA score

Models: Support Vector Regression, Bidirectional Recurrent Network, Feed Forward Neural Network

* Bodén, M., Yuan, Z. and Bailey, T. L. Prediction of protein continuum secondary structure with probabilistic models. Submitted.
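The slides do not say how the sequence is presented to the predictive models. One common choice for per-residue regression is a one-hot encoded sliding window around each position, and the sketch below shows that encoding. The window width, the zero padding at the sequence ends, and all names are assumptions made for illustration.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def encode_windows(sequence: str, half_width: int = 2) -> List[List[int]]:
    """One-hot encode a window of residues centred on each sequence position.

    Window slots that fall outside the sequence are left as all zeros.
    """
    features = []
    for center in range(len(sequence)):
        row: List[int] = []
        for pos in range(center - half_width, center + half_width + 1):
            one_hot = [0] * len(AMINO_ACIDS)
            if 0 <= pos < len(sequence):
                one_hot[AMINO_ACIDS.index(sequence[pos])] = 1
            row.extend(one_hot)
        features.append(row)
    return features


rows = encode_windows("MKIPDELG", half_width=2)
print(len(rows), len(rows[0]))  # 8 100
```

Each row would then be paired with the SCHEMA score at the central position as the regression target for whichever model is trained.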

Results

Table 1: Results for all approaches. r = correlation coefficient (ideally 1), devA = Root Mean Square Error (RMSE) normalized by the standard deviation (ideally 0).

Method     r      devA
FFNN       0.86   0.57
BRNN       0.88   0.52
SVR eps    0.82   0.63
SVR nu     0.83   0.62
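The two measures in Table 1 can be computed as in the following sketch. It assumes that r is the Pearson correlation and that devA divides the RMSE by the population standard deviation of the observed SCHEMA scores. The caption does not say which standard deviation is meant, so that choice is an assumption.

```python
import math
from typing import Sequence


def pearson_r(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Pearson correlation coefficient between predictions and observations."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


def dev_a(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """RMSE of the predictions divided by the standard deviation of the observations."""
    n = len(observed)
    mean_o = sum(observed) / n
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    return rmse / std_o


observed = [1.0, 2.0, 3.0, 4.0]
print(pearson_r([2.0, 4.0, 6.0, 8.0], observed))  # perfectly correlated
print(dev_a([2.5, 2.5, 2.5, 2.5], observed))      # always predicting the mean
```

Under this definition, a model that always predicts the mean of the observed scores has a devA of 1, so values below 1 indicate an improvement over that baseline.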

Refinements

[Diagram: ML models take predicted input features (contact numbers, solvent accessibility) and feed a predicting model that outputs the predicted SCHEMA score. Correlation coefficients (CC) shown: 0.88, 0.88, 0.6; ensemble: 0.88.]

However…

Only a limited number of connections are considered

Broken connections are reconnected after recombination

Summary

Design proteins with recombination rather than from scratch

Identify recombination site

Idea: finding the sites where the fewest interactions are disrupted (SCHEMA)

Predicting SCHEMA-score to overcome the limitation

SCHEMA too limited to be the only means for recombination site prediction

Future work

All interactions

Actual recombination process

Acknowledgments

Supervisors Dr. Mikael Bodén and Dr. Ricarda Thier

Dr. Zheng Yuan

Prof. Frances Arnold’s research group

Thank you

References:

C. A. Voigt, C. Martinez, Z.-G. Wang, S. L. Mayo, and F. H. Arnold. Protein building blocks preserved by recombination. Nat Struct Biol, 9(7):553-558, Jul 2002.

M. M. Meyer, J. J. Silberg, C. A. Voigt, J. B. Endelman, S. L. Mayo, Z.-G. Wang, and F. H. Arnold. Library analysis of SCHEMA-guided protein recombination. Protein Sci, 12(8):1686-1693, Aug 2003.

M. Bodén, Z. Yuan, and T. L. Bailey. Prediction of protein continuum secondary structure with probabilistic models. Submitted.

PDB 1zg4

Recombination Site Identification

Recombination vs Mutagenesis or Design from scratch

Higher fraction of functional proteins

Higher diversity leads to a higher chance to find a better hybrid

Requirement

Identify recombination site

Identify which segments are useful

Identify beneficial segment combinations

Existing methods

SCHEMA (Hybrid evaluation: avoid breaking connections)

FamClash (Hybrid evaluation: avoid changing properties of residue pairs)

STAR (Site suggestion according to structural compactness)

Known methods too limited to be a good means for recombination site prediction

http://www.che.caltech.edu/groups/fha/

Possible approaches

Identify a new measure for evaluating hybrids (derived from datasets of biologically produced hybrids)

Include more information in the decision process

Sequence/Structure (SCHEMA)

Chemical features (FamClash)

Predicting important residues for structure and/or function

Predicting enzyme function from protein sequence

Substitution tolerance

Hydrophobic patterning

Surface clefts or binding sites

Solvent accessibility

Domains/motifs of parents
