A biologist in e-Science

Published on October 30, 2008

Author: MarcoRoos

Source: slideshare.net

Description

Presentation for the BioAssist programmers face-to-face, November 17, 2008, Utrecht, The Netherlands. BioAssist is a nationwide bioinformatics support programme.

A biologist in e-Science? by Marco Roos

Acknowledgements: Scott Marshall, Edgar Meij, Sophia Katrenko, Willem van Hage, Pieter Adriaans, Martijn Schuemie, Carole Goble, Dave de Roure, Katy Wolstencroft, Andy Gibson, the myGrid and myExperiment teams, many others who share their ideas, and… You!

* Project or Area Liaison for OMII-UK (domain: Biology and Bioinformatics)

BioAssist programmers meeting, November 17, 2008, Utrecht, The Netherlands

A priori: What does e-Science mean to you?

BioAssistants say…

Collaboration

High throughput computing

Grid

Standardisation

Scientific integration (tools, databases, scientific objects)

Knowledge

Information integration

Biologists

Introducing myself A biologist

My prime interest Structure and function of DNA in the nucleus Escherichia coli Mouse fibroblast (skin) cells

My C.V. before e-Science (e-Science since 2003)

Molecular & Cellular biology (MSc)

microscopy and image analysis of chromosome structure

‘minor’ computer science

Image analysis methods to measure DNA content in bull sperm cells (civil service)

Chromatin structure & function (PhD molecular cytology)

F.I.S.H., microscopy, image analysis, statistics

3-D chromosome structure during cell cycle (no luck)

DNA movement in Escherichia coli (success)

Human Transcriptome Map (post-doc)

‘Traditional’ bioinformatics; data integration: gene expression to human genome sequence

Analysis of regions of increased gene expression

How did I end up here?

Marco Roos

Biologist and bioinformatician

Post-doc e-(bio)science, University of Amsterdam (BioRange/VL-e)

Project or Area Liaison (PAL) OMII-UK

Member BioAssist programme committee NBIC

Why should a biologist be interested in e-science?

BioAssistants guess…

Involves Computation

Interpretation of results

Biology isn’t that interesting

Reinvention of the wheel

Lack of standards

Sharing results

Reshaping biology

Synergy effect between these sciences

Emerging Data driven science

My prime interest Structure and function of DNA in the nucleus Escherichia coli Mouse fibroblast (skin) cells

Components controlling structure & function of DNA

Connecting the dots (example: protein interaction network in yeast)

Biomedical knowledge repository

PubMed statistics (http://www.ncbi.nlm.nih.gov/entrez)

>17 million citations

>400,000 added/year

~70,000 searches/month

…

Does not compute

Does not fit

1070 databases, Nucleic Acids Research, Jan 2008 (96 in Jan 2001)

Proteomics

Genomics

Transcriptomics

Protein sequence prediction

Phenotypic studies

Phylogeny

Sequence analysis

Protein Structure prediction

Protein-protein interaction

Metabolomics

Model organism collections

Systems Biology

Epidemiology …

What do I do? A needy biologist

‘Old school’ bioinformatics: a typical bioinformatician

‘Old school’ bioinformatics: a biologist behind a computer who (just) learned Perl

/*
 * determines ridges in htm expression table
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h"   /* assumed to define TRUE/FALSE and to declare the functions below */

/* runs the query for one chromosome; the result is handed back through *htmtable */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* write into querystring (the destination buffer was missing) and quote the chromosome name */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);
    return validquery(*htmtable, querystring);
}

int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: */
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;
    /* PQgetvalue returns text, so convert before comparing; test the current row, not row 0 */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold)
    {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main()
{
    PGconn *conn;            /* holds database connection */
    char querystring[256];   /* query string */
    PGresult *result;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD)
    {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(1);
    }
    else printf("Connection ok\n");

    sprintf(querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);
    if (validquery(result, querystring))
    {
        printresults(result);
    }
    else
    {
        PQclear(result);
        PQfinish(conn);
        return EXIT_FAILURE;   /* non-zero exit status signals failure to the shell */
    }
    PQclear(result);
    PQfinish(conn);
    return EXIT_SUCCESS;
}

/* prints the first column of at most the first 10 rows */
int printresults(PGresult *tuples)
{
    int i;
    for (i = 0; i < PQntuples(tuples) && i < 10; i++)
    {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* returns TRUE if the query produced a tuple set, FALSE otherwise */
int validquery(PGresult *result, char *querystring)
{
    printf("in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK)
    {
        printf("Query %s failed.\n", querystring);
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}
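The ridge test in the listing above is tied to a live database result, which makes it hard to check on its own. As a minimal sketch, assuming the expression values have already been read into an array sorted by gene start, the same rule (at least mincount consecutive genes at or above a threshold) could be written without recursion or a database. The name is_ridge_run and its parameters are hypothetical and do not come from the original slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if `mincount` consecutive genes, starting at index
 * `start`, all have an expression value at or above `threshold`.
 * `expr` holds one value per gene, sorted by genomic start position
 * (assumed to correspond to the movmed39expr column in the listing).
 */
bool is_ridge_run(const double *expr, size_t n, size_t start,
                  double threshold, size_t mincount)
{
    assert(expr != NULL);
    if (mincount == 0)
        return true;            /* an empty run trivially qualifies */
    if (start >= n || mincount > n - start)
        return false;           /* not enough genes left in the table */
    for (size_t i = start; i < start + mincount; i++) {
        if (expr[i] < threshold)
            return false;       /* run is broken by a low value */
    }
    return true;
}
```

Keeping the rule separate from the query code means it can be exercised on a handful of hand-made values before it is pointed at real data.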

Theme: Not an e-Science approach

The ‘spaghetti’ approach

Computational tools graveyard (rephrasing David Shotton)

Database survival: <20% ‘no problems’

Data graveyard (quoting David Shotton)

Why should a biologist be interested in e-science?

Lots of data and knowledge to deal with

Bioinformaticians make spaghetti and graveyards

Bridging biology and computer science

Marco Roos

Biologist and bioinformatician

Post-doc e-(bio)science, University of Amsterdam (BioRange/VL-e)

Project or Area Liaison (PAL) OMII-UK

Member BioAssist programme committee NBIC

Empowering biologists and bioinformaticians

How could we be empowered?

BioAssistants guess…

Wear a champion shirt

Data integration solutions (warehouse or something)

Communication between tools

Sharing methods

Talk the same language

More metadata about wet/dry experiments

Data directories (find data)

Google-type search

Sharing knowledge

Gain knowledge by combining knowledge

Experiment 1: Model-based data integration

Example: UCSC genome browser

[Diagram labels: Transcription Factor Binding Site, partOf (* to *)]

Experiment 2: An e-Science approach for automated knowledge extraction from literature

Roos, Marshall, et al., ISMB/ECCB, Vienna, 2007

An e-science approach

Combining expertise

Collaborating and sharing

Technology

Which diseases are associated with my protein of interest ‘EZH2’

Biological knowledge extraction

Biological question/model

Computational experiment

Extracted knowledge

>17 million citations, +400,000/yr

Combining expertise: Edgar Meij, information retrieval expert

Combining expertise: Sophia Katrenko, machine learning expert

Combining expertise: Willem van Hage, Semantic Web expert (and bass guitar player)

Combining expertise, towards a knowledge framework: Scott Marshall, computer scientist and bioinformatician

The AIDA toolbox, Web Services for knowledge extraction and knowledge management

e-Science collaboration: AIDA toolbox

“Collaboration through Web Services”: Martijn Schuemie, bio-text mining expert, BioSemantics group, Erasmus University Rotterdam

“Collaboration through Web Services”: Hideaki Sugawara, biological database expert

“Collaboration through Web Services”: e-bioscientist

A nice experiment design

A not so nice experiment design

A workflow: protocol for a computational experiment

05/06/09 BioAID

Sharing and publishing my designs

BioAID Disease Discovery workflow (figure labels: AIDA, OMIM service (Japan), Taverna ‘shim’)

An insightful computational experiment

e-Science leveraging the use of more brains: Want this…

e-Science leveraging the use of more brains: … need this

Publish and share

Publish & share research objects

myExperiment

>400 workflows

>1000 registered users (< 1yr)

Run workflows without Taverna (expert feature)

Open to objects other than workflows

Link out to other resources

Do I feel all powerful now? An e-biologist?

Tabular output

Output good for viewing

Useful, but sufficient?

Query discoveries?

Fits biological modelling?

Basis for new experiments?

Flexible enough?

Underestimated: The brain bottleneck

Empower me with a ‘virtual brain’

[Figure* labels: My ws, Your ws]

* From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34

Workflow and Semantic Web

Query

Retrieve documents from Medline

Extract proteins (Homo sapiens)

Calculate ranking scores

Create biological cross references

Convert to table (html)

Add documents (IDs) to semantic model

Add proteins to semantic model

Add scores to semantic model

Add cross references to semantic model

Add query to semantic model
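To make the "add … to semantic model" steps concrete, here is a minimal sketch that stores each workflow result as a subject/predicate/object statement in an in-memory list. It is an illustration only, not the AIDA or Taverna implementation: the struct layout, the function names (model_add, model_count, record_hit), and the predicate names are all hypothetical, and the cross-reference step is left out for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRIPLES 64
#define MAX_LEN 64

/* One statement in a toy semantic model: subject, predicate, object. */
struct triple {
    char subject[MAX_LEN];
    char predicate[MAX_LEN];
    char object[MAX_LEN];
};

struct model {
    struct triple triples[MAX_TRIPLES];
    int count;
};

/* Appends a statement; returns 0 on success, -1 if the model is full. */
int model_add(struct model *m, const char *s, const char *p, const char *o)
{
    assert(m != NULL);
    if (m->count >= MAX_TRIPLES)
        return -1;
    struct triple *t = &m->triples[m->count++];
    snprintf(t->subject, MAX_LEN, "%s", s);
    snprintf(t->predicate, MAX_LEN, "%s", p);
    snprintf(t->object, MAX_LEN, "%s", o);
    return 0;
}

/* Counts the statements that use the given predicate. */
int model_count(const struct model *m, const char *predicate)
{
    int n = 0;
    for (int i = 0; i < m->count; i++)
        if (strcmp(m->triples[i].predicate, predicate) == 0)
            n++;
    return n;
}

/*
 * Records one workflow result: the query retrieved a document, the
 * document mentions a protein, and the protein has a ranking score.
 * The predicate names are placeholders, not terms from a real ontology.
 */
int record_hit(struct model *m, const char *query, const char *doc_id,
               const char *protein, double score)
{
    char score_text[MAX_LEN];
    snprintf(score_text, sizeof score_text, "%.3f", score);
    if (model_add(m, query, "retrievedDocument", doc_id) != 0) return -1;
    if (model_add(m, doc_id, "mentionsProtein", protein) != 0) return -1;
    if (model_add(m, protein, "hasRankingScore", score_text) != 0) return -1;
    return 0;
}
```

Because the query, documents, proteins, and scores end up in one model rather than in a flat table, a later step can ask questions of the results, for example how many protein mentions were recorded, without re-running the workflow.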

Do I feel all powerful now? An e-biologist?

http://staff.science.uva.nl/~roos/ChromatinWorkgroup/

e-Laboratory factories

Conclusions

How do we know when e-Science has succeeded?

Not just accelerated but new

A. When everyone is using Grid computing?

B. When scientists make scientific advances that would not have happened otherwise?

Slide from ‘The New e-Science’ by Dave de Roure

Conclusions

e-Science is for people

Empower them: Grid, Semantic Web, Workflow, e-Laboratories for people!

Let scientists be scientists

Scientists require better, not perfect

Workflow empowers scientists

Empowering by Semantic Web and e-Laboratories in development

How would you like to be empowered?

BioAssistants say

Biologists asking understandable (solvable) questions

Computer scientists giving understandable answers

Education

New technologies

Good machinery

Errors on the grid

Task sharing; build in collaboration

Project and Area Liaison

Marco Roos

Biologist and bioinformatician

Post-doc e-(bio)science, University of Amsterdam (BioRange/VL-e)

Project or Area Liaison (PAL) OMII-UK

Member BioAssist programme committee NBIC

How many brains do you want to use? – One?

Some?

Many?

Use your community: myGrid/myExperiment, OMII-UK, You

End of presentation...

Thank you

http://adaptivedisclosure.org

Some related presentations

http://www.slideshare.net/dder/the-new-science-bangalore-edition

http://www.slideshare.net/dullhunk/the-seven-deadly-sins-of-bioinformatics
