Using RDF to provide interoperable metadata:
  Timo Hannay
  Nature Publishing Group
  5 February 2004

Nature Publishing Group:
  Publishers of Nature, the international weekly journal of science
  Also 8 other Nature research journals and 7 Nature reviews titles
  About 30 other specialist titles (e.g., EMBO J and British Journal of Cancer)
  Scientific reference works, including the Nature Encyclopedia of Life Sciences

NPG New Technology:
  Deploy emerging web technologies at Nature.com and other NPG sites
  Particular interests in:
    Web Services such as RSS and OAI
    Semantic Web and scientific ontologies
    XML technologies like XSLT and SVG
    Databases in scientific publishing
  Pioneering database publishing project: The AfCS-Nature Signaling Gateway

Overview:
  RDF
  RSS
  Urchin: An RSS/RDF application

What is RDF?:
  Resource Description Framework
  Simple data model for capturing factual statements in an interoperable form
  Can be expressed as RDF/XML but also other forms (e.g., N3 or visually)
  The foundation of the Semantic Web but not limited to these applications

The RDF ‘triple’ data model:
  <http://dx.doi.org/10.1038/425003a> <http://purl.org/dc/elements/1.1/creator> "Declan Butler" .
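As a rough illustration of the triple model, the sketch below treats a statement as a (subject, predicate, object) record and writes it out as one N-Triples-style line. It assumes nothing about any RDF toolkit: the `Triple` type, its `obj_is_literal` flag and the `to_ntriples` helper are names invented for this example, and no escaping of special characters is attempted.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject and predicate are URIs, object is a URI or a literal."""
    subject: str
    predicate: str
    obj: str
    obj_is_literal: bool = True  # hypothetical flag, used only by this sketch


def to_ntriples(t: Triple) -> str:
    """Serialise a triple as a single N-Triples-style line (no escaping handled)."""
    obj = f'"{t.obj}"' if t.obj_is_literal else f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# The statement from the slide: the creator of the article is "Declan Butler".
statement = Triple(
    subject="http://dx.doi.org/10.1038/425003a",
    predicate="http://purl.org/dc/elements/1.1/creator",
    obj="Declan Butler",
)

line = to_ntriples(statement)
print(line)
```

A set of such tuples is already a small directed labelled graph: subjects and objects are the nodes, and predicates label the arcs between them.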
(Slide panels labelled N3 and Visual; graph diagram not reproduced.)

More about RDF:
  RDF is a directed labelled graph
  Subjects and predicates must use URIs, but objects can use literals
  Nodes can have any number of incoming or outgoing arcs, and they can be anonymous
  Arcs and nodes can be linked to form arbitrarily large networks of statements
  Querying involves the extraction of sub-graphs according to user-defined criteria
  Provenance is important (but not covered here)
  The Semantic Web vision is essentially one of billions of interlinked RDF triples

RDF-related technologies:
  RDF/XML builds on:
    XML
    XML Namespaces
  RDF has been built on by:
    RDF Schema
    The Web Ontology Language, OWL

The Semantic Web stack:
  (slide content not reproduced)

Overview:
  RDF
  RSS
  Urchin: An RSS/RDF application

What is RSS?:
  A lightweight XML format for syndicating news titles, links and descriptions
  Originally developed by Netscape, more recently adopted by bloggers
  Consumed by (e.g.):
    Users with desktop readers
    Webmasters who want to embed titles from other sites in their own pages

Structure of an RSS feed:
  (slide content not reproduced)

Example of an RSS feed:
  <?xml version="1.0" encoding="ISO-8859-1" ?>
  <!DOCTYPE rss SYSTEM "http://www.bbc.co.uk/syndication/feeds/news/rss-0.91.dtd">
  <rss version="0.91">
    <channel>
      <title>BBC News | News Front Page | UK Edition</title>
      <link>http://news.bbc.co.uk/go/click/rss/0.91/public/-/1/hi/default.stm</link>
      <description>Updated every minute of every day</description>
      <language>en-gb</language>
      <lastBuildDate>Tue, 16 Sep 03 09:21:32 GMT</lastBuildDate>
      <copyright>Copyright: (C) British Broadcasting Corporation, http://news.bbc.co.uk/2/shared/bsp/hi/services/copyright/html/default.stm</copyright>
      <docs>http://www.bbc.co.uk/syndication/</docs>
      <image>
        <title>BBC News</title>
        <url>http://news.bbc.co.uk/nol/shared/img/bbc_news_120x60.gif</url>
        <link>http://news.bbc.co.uk</link>
      </image>
      <item>
        <title>Hutton witnesses face tough questions</title>
        <description>Witnesses at the inquiry into Dr David Kelly's death will face cross-examination, a day after a BBC boss and a spy chief gave evidence.</description>
        <link>http://news.bbc.co.uk/go/click/rss/0.91/public/-/1/hi/uk_politics/3111926.stm</link>
      </item>
      <item>
        <title>Deadly blast ends Japan siege</title>
        <description>At least three people are killed in an explosion in an office where a disgruntled worker had taken hostages.</description>
        …

Example of an RSS feed:
  (slide content not reproduced)

Different versions of RSS:
  Versions shown: 0.9, 0.91, 0.92, 2.0, “Atom”, 1.0
  Simple: Plain XML
  Extensible: RDF/XML
  Legend: = most popular formats (the marking on individual versions is not reproduced)

RDF and RSS:
  RSS 1.0

Current NPG RSS Feeds:
  Nature Science Update
  Nature Materials Update
  Nature Signaling Update
  NatureJobs editorial
  British Dental Journal TOC
  All in RSS 1.0 format

Overview:
  RDF
  RSS
  Urchin: An RSS/RDF application

Urchin:
  A Perl-based framework for generating, reading, storing, aggregating and filtering RSS news feeds
  Takes an RDF-centric approach to allow for flexibility in storing and querying metadata
  Conceived and designed by NPG New Technology, funded by JISC
  Released as open source and available from http://urchin.sourceforge.net/ (currently v0.81)

Urchin architecture:
  (slide content not reproduced)

Urchin data model:
  (slide content not reproduced)

Urchin data model (RDF section):
  (slide content not reproduced)

Ways to filter Urchin output:
  Boolean keyword queries: Easy and intuitive but limited to preset fields
  Full RDF querying: More involved but allows filtering on any metadata
  Bayesian filtering (experimental): Train by example when keywords won’t do
  Breaking news (experimental): Automatic identification of recently popular phrases

Ways to present Urchin output:
  RSS (0.91 and 1.0 supported natively)
  HTML (simple templates included)
  XSL transformation of RSS 1.0: Allows almost any kind of HTML or text display
  HTML::Template: Similar functionality to XSLT but with a different syntax

Slide28:
  AGGREGATE bbc Microsoft AND (security OR virus OR worm)

Slide29:
  Select ?item
  From ?item->nj:advertises{?job}
  Where ?job->nj:city = 'Cambridge'

Slide30:
  Select ?article
  From ?article->dc:creator{?author},
  Where ?item->dc:creator{?author}
  And $item->rss:link = 'http://dx.doi.org/10.1038/427005a'

RDF and the Network Data Model:
  NPG and Oracle investigating the suitability of the NDM to store and query RDF-encoded information
  Storage looks OK:
    Can hold directed labelled graphs
    Allows URIs, literals and blank nodes
    Can include provenance information
  Querying may need more development:
    Can extract sub-graphs but performance and scalability need to be tested
    RDF/XML import and export would be desirable
    Support for RDFS- and OWL-based inferencing
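To make "extracting sub-graphs" concrete, here is a minimal Python sketch of the kind of question the Slide29 query asks: which items advertise a job whose city is Cambridge? It is not Urchin's query engine and not the NDM. The `match` and `items_advertising_jobs_in` helpers and the `item:`/`job:` identifiers are invented sample names, and the `nj:` prefixed names are used as plain strings because the slides do not give the namespace URI.

```python
# Triples as (subject, predicate, object). All identifiers are invented sample data.
TRIPLES = {
    ("item:1", "nj:advertises", "job:1"),
    ("item:2", "nj:advertises", "job:2"),
    ("job:1", "nj:city", "Cambridge"),
    ("job:2", "nj:city", "London"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in sorted(triples)
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def items_advertising_jobs_in(triples, city):
    """Two-step join: find jobs in the city, then the items that advertise them."""
    jobs = {s for (s, _, _) in match(triples, predicate="nj:city", obj=city)}
    return sorted(
        s for (s, _, o) in match(triples, predicate="nj:advertises") if o in jobs
    )


result = items_advertising_jobs_in(TRIPLES, "Cambridge")
print(result)
```

Each call to `match` scans every triple, which is fine for a handful of statements but is exactly where the performance and scalability questions raised above would start to matter.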

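The Boolean keyword query on Slide28 can be read, under one plausible interpretation, as a filter over each item's title and description. The sketch below hard-codes that expression in Python. It ignores the `AGGREGATE bbc` part, which presumably selects the feed, and the sample items, the `words` helper and the `matches` function are all invented for illustration rather than taken from Urchin.

```python
import re

# Invented sample items using the preset fields of an RSS item.
ITEMS = [
    {"title": "Microsoft patches worm flaw", "description": "A fix is released."},
    {"title": "Microsoft reports earnings", "description": "Quarterly results."},
    {"title": "New virus spreads", "description": "Affects mail servers."},
]


def words(item):
    """Lower-cased set of words from the item's title and description."""
    text = f"{item['title']} {item['description']}"
    return set(re.findall(r"\w+", text.lower()))


def matches(item):
    """Microsoft AND (security OR virus OR worm), applied to title and description."""
    w = words(item)
    return "microsoft" in w and bool(w & {"security", "virus", "worm"})


selected = [item["title"] for item in ITEMS if matches(item)]
print(selected)
```

Because only preset fields are searched, a filter like this is easy to write but cannot reach other metadata, which is the trade-off the slides note against full RDF querying.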
