Bio2RDF should we do it


Published on June 2, 2008

Author: fbelleau



The initial Bio2RDF project description, shown at the Semantic Web birds-of-a-feather session during ISMB 2005.

Thanks to Christopher Baker, Kei Cheung, Johanne Luciano and Eric Neumann for the initial inspiration.

Bio2RDF: Should we do it?

Bio2RDF Architecture
[Diagram: XML, KGML and CSV sources feed the Bio*2RDF converter, which produces RDF. Sourceforge side: ready-to-use files available with CVS. User side.]
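To make the converter box in the architecture more concrete, here is a minimal sketch of a CSV-to-RDF step. It is not the project's actual converter: the `http://example.org/bio2rdf/` namespace, the `csv_to_ntriples` function and the sample columns are all hypothetical, and the output uses N-Triples (one statement per line) rather than RDF/XML to keep the example short.

```python
import csv
import io

# Hypothetical namespace, chosen only for this illustration.
BASE = "http://example.org/bio2rdf/"


def csv_to_ntriples(csv_text, id_column):
    """Turn each non-empty CSV cell into one N-Triples statement.

    The row's id_column value names the subject and the column header
    names the predicate. Assumes headers and ids are safe to embed in
    an IRI (no spaces or special characters).
    """
    statements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}record/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the id itself and empty cells
            # Only backslash and double quote are escaped in this sketch.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            statements.append(
                f'{subject} <{BASE}property/{column}> "{escaped}" .'
            )
    return statements


# Made-up sample data: two records, the second with an empty cell.
sample = "id,symbol,organism\n1,geneA,mouse\n2,geneB,\n"
for statement in csv_to_ntriples(sample, "id"):
    print(statement)
```

A real converter for XML or KGML would follow the same pattern: parse the source format, pick a stable identifier for each record, and emit one triple per field.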

The problem
● Too many knowledge sources available for life science scientists
● Too many formats (text, XML, HTML)
● New source each day with specialized tool or web interface
● Integration problem recognised by global community

One early solution
● Semantic web browsers (BioDash) are in development, so what can we do in the meantime?
– Adopt the semantic web format (RDF)
– BioPAX, Swissprot already offer RDF documents
– Select a strong knowledge tool to work with (Protégé)
– Convert popular knowledge sources to RDF in a community effort (Bio2RDF)

What is RDF
● Simple format from the W3C Semantic Web initiative, made of triples and usually written as XML
● RDF is the foundation on which OWL is built
● Many tools from the computer science community already read RDF (Protégé)
● Inference tools are available (RACER, FACT)
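As an illustration of what "made of triples" means, the sketch below writes two statements about GO terms as (subject, predicate, object) triples and prints them in N-Triples syntax. The `http://example.org/go#` namespace and the `to_ntriples` helper are assumptions made for this example; the two `rdfs:` property URIs are the standard W3C ones.

```python
# Standard RDF Schema property URIs.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
RDFS_LABEL = RDFS + "label"
RDFS_SUBCLASS_OF = RDFS + "subClassOf"

# Hypothetical namespace for GO terms, used only in this sketch.
GO_NS = "http://example.org/go#"


def to_ntriples(subject, predicate, obj, literal=False):
    """Serialize one triple as a single N-Triples line.

    Only backslash and double quote are escaped, which is enough
    for the short labels used here.
    """
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_text = f'"{escaped}"'
    else:
        obj_text = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_text} ."


triples = [
    # GO:0008150 carries the label "biological_process".
    (GO_NS + "GO_0008150", RDFS_LABEL, "biological_process", True),
    # GO:0008152 (metabolic process) is a kind of biological process.
    (GO_NS + "GO_0008152", RDFS_SUBCLASS_OF, GO_NS + "GO_0008150", False),
]

lines = [to_ntriples(*t) for t in triples]
print("\n".join(lines))
```

The same two statements could be written in RDF/XML; the triples stay identical and only the serialization changes.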

GO definition in RDF

What is Protégé
● Mature software to work with knowledge bases and ontologies
● Open source Java application used by a community of 30,000 users
● Ontology editor with a GUI
● It supports RDF natively
● Many specialized plugins
– Visualisation
– Import/export to specialized file formats
● Gives the experience of semantic browsing

Protégé+RDF demo
● GO ontology in Protégé
● BioPAX from the Reactome glycolysis pathway converted into RDF for visualisation with the TouchGraph plugin
● GO + MGI
– An example of merging knowledge
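The "merging knowledge" idea can be sketched without any semantic web tooling: once two sources are expressed as triples that share identifiers, merging them is a set union, and a question that needs both sources becomes a simple join. Everything below is made up for illustration (the `http://example.org/` namespace, the `annotatedWith` and `label` properties, the gene `geneA`); it does not reproduce the actual GO or MGI files.

```python
EX = "http://example.org/"  # hypothetical namespace

# Triples from an ontology source: a term and its label.
go_triples = {
    (EX + "go/GO_0008152", EX + "label", "metabolic process"),
}

# Triples from an annotation source: a gene linked to a term.
annotation_triples = {
    (EX + "gene/geneA", EX + "annotatedWith", EX + "go/GO_0008152"),
}

# Merging two RDF graphs is the union of their triple sets.
merged = go_triples | annotation_triples


def labels_for(gene, graph):
    """Return the sorted labels of all terms the gene is annotated with."""
    terms = {o for s, p, o in graph
             if s == gene and p == EX + "annotatedWith"}
    return sorted(o for s, p, o in graph
                  if s in terms and p == EX + "label")


# Answerable only on the merged graph, not on either source alone.
print(labels_for(EX + "gene/geneA", merged))
```

The join works only because both sources use the same identifier for the term, which is why agreeing on shared URIs matters as much as agreeing on the format.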

Go.rdf in Protégé

Go.rdf in Protégé Full text search

Go.rdf in Protégé Hierarchical browsing

Go.rdf in Protégé DAG graph

Citratecycle.kgml.rdf from KEGG with TouchGraph visualisation

Knowledge integration: KEGG + GO + Affymetrix + EntrezGene
● A central repository for tools to convert bioinformatics data and knowledge bases to RDF format
● A repository of ready-to-use RDF files for loading in Protégé or other semantic tools
● A place for the semantic web life science community to develop and grow

Who is in?

