Open Calais Release 4.0

Information about Open Calais Release 4.0
Technology

Published on January 16, 2009

Author: KristaThomas

Source: slideshare.net

Description

A brief, entry-level overview of version 4.0 of the Calais Web service. Calais 4.0 automatically connects publishers to the exploding ecosystem of Linked Data assets on the Web, and helps them syndicate their metadata to reach downstream readers via search engines, news aggregators, 'related stories' recommendation services and more.

Thomson Reuters Calais Initiative: Calais 4.0 ~ January 14, 2009 ~ Thomas (“Tom”) Tague and Krista Thomas

Overview

Going to discuss five basic topics

What is Calais?

Why we’re doing it & what our goals are

How it works / What’s under the hood?

A few examples

Where it’s headed

Calais? What’s Calais?

As seen from the U.K. & the Continent

As seen from North America

As seen by us

Calais? What’s Calais?

A semantic metadata generation service that extracts entities, facts and events from unstructured text

Creates linkages from extracted entities to the linked data ecosystem

Provides a transportation layer for rich semantic metadata from producers to consumers

Details to follow….

Why We’re Doing It

Two simple answers:

Hyper-evolution of capabilities – better, faster, stronger

The walled garden content world

Our Goals / The Capabilities We Want to Deploy

Let’s state them here and then walk through why we have these goals

Derive semantic metadata from textual assets

Use that semantic metadata to create entry points into the linked data ecosystem

Provide a simple mechanism for the sharing of semantic metadata about textual content assets

1: Semantics from Text: The Text Problem

People consume text

Most of it isn’t semantically enabled

Most of it won’t be semantically enabled

This isn’t about standards – microformats vs RDFa vs whatever.

Why: Latency, cost and short shelf-life

1: Semantics from Text: The Text Problem

[Figure: Tweets, Blogs, News, Scientific Publications and Great Novels compared by latency and shelf life, each on a scale from seconds to years]

Target areas where:

The economics don’t support metadata creation

The value of metadata is potentially high

The value of aggregated metadata is potentially extremely high

2: Getting from Text to the Linked Data Ecosystem

The Linked Data Cloud

3: Semantic Metadata Transport Layer

I’m a content producer. We’ve loaded the car with rich semantic metadata

I’m sharing it within my four walls

How do I transport it to my consumers?

RSS / Atom, XML, Proprietary data feeds, Content APIs

How it Works – Under the Hood of Calais

[Diagram components: Calais Web Service; ClearForest NLP Engine; Rule Base; Lexicons; RDF; Disambig. Engine; Reference Data Assets; Metadata Management; Document Level Metadata; Entity Level Linked Data and …; Output Formatting; Stat Tools]

How You Can Use It – the SemHead version

Send unstructured text

Get back document categorization, entities, facts and events – with document and entity level URIs

Syndicate Metadata

Send unstructured text

Share / syndicate the document GUID

Access Endpoints

Use entity level URI

Access entity level Linked Data endpoints & TR Content
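The usage patterns above can be sketched in code. This is a minimal illustration, not the actual Calais API: the response shape (a dict with a `doc_guid` and an `entities` list of `type` / `name` / `uri` records), the field names and the `example.org` URIs are all assumptions invented for this sketch. It shows how a client might group the entity level URIs it gets back by entity type, and keep the document GUID for syndication.

```python
from collections import defaultdict

# Hypothetical response shape, invented for illustration only; the real
# Calais output format is not described in this presentation.
sample_response = {
    "doc_guid": "http://example.org/doc/123",
    "entities": [
        {"type": "Company", "name": "Thomson Reuters", "uri": "http://example.org/entity/1"},
        {"type": "Person", "name": "Tom Tague", "uri": "http://example.org/entity/2"},
        {"type": "Company", "name": "ClearForest", "uri": "http://example.org/entity/3"},
    ],
}


def entity_uris_by_type(response):
    """Group entity level URIs by entity type, preserving response order."""
    grouped = defaultdict(list)
    for entity in response["entities"]:
        grouped[entity["type"]].append(entity["uri"])
    return dict(grouped)


uris = entity_uris_by_type(sample_response)

# The document GUID is what a publisher would share or syndicate;
# the entity URIs are what a consumer would dereference for linked data.
print(sample_response["doc_guid"])
print(uris["Company"])
```

Grouping by type keeps the sketch close to the slide: one identifier for the whole document, plus one URI per extracted entity.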

Entities, Facts & Events

Anniversary, City, Company, Continent, Country, Currency, EmailAddress, EntertainmentAwardEvent, Facility, FaxNumber, Holiday, IndustryTerm, MarketIndex, MedicalCondition, MedicalTreatment, Movie, MusicAlbum, MusicGroup, NaturalDisaster, NaturalFeature, OperatingSystem, Organization, Person, PhoneNumber, Product, ProgrammingLanguage, ProvinceOrState, PublishedMedium, RadioProgram, RadioStation, Region, SportsEvent, SportsGame, SportsLeague, Technology, TVShow, TVStation, URL

Acquisition, Alliance, AnalystEarningsEstimate, AnalystRecommendation, Bankruptcy, BonusShares, BusinessRelation, Buybacks, CompanyAffiliates, CompanyCustomer, CompanyEarningsAnnouncement, CompanyEarningsGuidance, CompanyInvestment, CompanyLegalIssues, CompanyLocation, CompanyMeeting, CompanyReorganization, CompanyTechnology, CompanyTicker, ConferenceCall, CreditRating, EmploymentRelation, FamilyRelation, FDAPhase, IPO, JointVenture, ManagementChange, Merger, MovieRelease, MusicAlbumRelease, PatentFiling, PatentIssuance, PersonAttributes, PersonCommunication, PersonEducation, PersonEmailAddress, PersonPolitical, PersonPoliticalPast, PersonProfessional, PersonProfessionalPast, PersonRelation, PersonTravel, Quotation, SecondaryIssuance, StockSplit

Extending Calais’ Reach

More than just a web service – a growing collection of tools and applications to make it valuable in the real world

Calais Browser Extensions: Gnosis

Content Management Tools: WordPress, Drupal, UIMA

Development Tools & Libraries: PHP, Ruby, JAVA, .NET

Applications: TopBraid, RSS Tagger, Powerhouse, LinkedFacts, Wirecatch, FeedShaver

And more…

Calais progress to date

Launched in late January, 2008

9,000 developers have joined OpenCalais.com

Approx. 1 million content ‘transactions’ per day

Delivered four major update releases

Lots of interesting apps

The Mail & Guardian Online (http://www.mg.co.za/)

www.powerhousemuseum.com

Gist.whistlehog.com

http://www.semanticproxy.com

Example: The Mail & Guardian Online, South African Newspaper

Using Calais to metatag new and historical articles, and:

Build an index of topics A-Z

Pull out automatic related articles or pictures

Create news alerts on companies or people

Pull up maps for the countries named in articles

Predict readers’ interests based on browsing habits

Create tag clouds, showing popular subjects, people, etc.

Using Calais to optimize search and navigation; drive consumer engagement
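Two of the uses listed above, the A-Z topic index and the tag cloud, reduce to simple aggregation once articles carry tags. The sketch below is only an illustration under stated assumptions: the article list, the tag values and the function names `tag_counts` and `a_to_z_index` are hypothetical, and nothing here reflects how the Mail & Guardian actually built its site.

```python
from collections import Counter, defaultdict

# Hypothetical tagged articles: (headline, tags). In practice the tags
# would come from a tagging service; these values are made up.
articles = [
    ("Article 1", ["South Africa", "Eskom"]),
    ("Article 2", ["Eskom", "Zimbabwe"]),
    ("Article 3", ["South Africa", "Eskom"]),
]


def tag_counts(tagged_articles):
    """Count how many articles mention each tag (the basis of a tag cloud)."""
    counts = Counter()
    for _, tags in tagged_articles:
        counts.update(set(tags))  # count each tag once per article
    return counts


def a_to_z_index(tagged_articles):
    """Build an A-Z index mapping an initial letter to its sorted tags."""
    index = defaultdict(set)
    for _, tags in tagged_articles:
        for tag in tags:
            index[tag[0].upper()].add(tag)
    return {letter: sorted(tags) for letter, tags in sorted(index.items())}


cloud = tag_counts(articles)
index = a_to_z_index(articles)
print(cloud.most_common(2))
print(index)
```

A tag cloud would then scale each tag's font size by its count, and the index would link each tag to the articles that carry it.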

Example: Gist - today’s news filtered by people, places & events

GIST uses Calais to prioritize stories, rank newsmakers & reveal trends / reader demand. It automatically aggregates multiple news sources and slots them into topics.

Example: The Powerhouse Museum in Sydney

Using Calais to tag historical archives & using tags as search terms

Example: IT Healthcare News

Using Calais to surface ambient “related content”

Examples

Those are examples of first generation uses. Some of what we’re seeing in the pipeline:

Social Resume analysis

Investigative Journalism*

Museum metadata coalitions

Investigative Journalism

[Diagram: FOIA contract documents and news are each sent through the Calais Web Service; the extracted relations (Company:Person, FamilyRelation, Company:Contract, Company:Affiliation) are combined into a “Big Fuzzy Graph”]
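The flow on this slide, relations extracted from separate document sets and merged into one graph, can be sketched as follows. Everything concrete here is an assumption made for illustration: the triples, the names in them and the helpers `build_graph` and `connected` are hypothetical, and the relation labels are simply borrowed from the slide.

```python
from collections import defaultdict

# Hypothetical relation triples (subject, relation, object), standing in
# for what might be extracted from each document set. All values invented.
foia_relations = [
    ("Acme Corp", "Company:Contract", "Contract 42"),
    ("Acme Corp", "Company:Person", "Jane Doe"),
]
news_relations = [
    ("Jane Doe", "FamilyRelation", "John Doe"),
    ("Acme Corp", "Company:Affiliation", "Beta Ltd"),
]


def build_graph(*relation_sets):
    """Merge relation triples from several sources into one undirected graph."""
    graph = defaultdict(set)
    for relations in relation_sets:
        for subject, relation, obj in relations:
            graph[subject].add((relation, obj))
            graph[obj].add((relation, subject))
    return graph


def connected(graph, start):
    """Return every node reachable from start (depth-first traversal)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for _, neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


graph = build_graph(foia_relations, news_relations)
reachable = connected(graph, "Contract 42")
print(sorted(reachable))
```

The point of merging is that a path can cross sources: here a contract found in one document set connects, through a company and a person, to a family relation found in another.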

What’s new in Release 4?

Release 4 – What’s New?

Linked data for approximately 25 entities

A start at Thomson Reuters contributed content

Metadata hosting and transport

Basic French

Published RDFS Ontology

New entities / relationships

Products

Competitive intelligence

Expanded document level categorization

What’s in the Pipeline?

2009 (this is a fuzzy list)

Person disambiguation @ domain level?

Other disambiguation

Dramatic expansion of endpoints (entities & events)

Calais as hub

Exposure of the IDE?

User managed lexicons

Languages

Opt-in SPARQL Endpoint?

www.opencalais.com

Gallery – code and application examples

Forums

Documentation
