Semantics at the multimedia fragment level SSSW 2013

Information about Semantics at the multimedia fragment level SSSW 2013
Technology

Published on July 13, 2013

Author: troncy

Source: slideshare.net

Description

"Semantics at the multimedia fragment level or how enabling the remixing of online media" - Invited Talk given at the Semantic Web Summer School (SSSW), 12 July 2013

Semantics at the multimedia fragment level or how enabling the remixing of online media Raphaël Troncy <raphael.troncy@eurecom.fr>

11/07/2013 - Semantics at the multimedia fragment level - SSSW, Cercedilla, July 2013 - 2

Once upon a time …

… leading to sharing Media Fragments
- Publishing a status message containing a Media Fragment URI
- Use a '#'!
- Highlight a video sequence
- Highlight a region to pay attention to

What are Media Fragments?
[Figure: timeline (t) marked 0, 20 and 35, illustrating a temporal media fragment, a spatial media fragment and a track media fragment]

Media Fragments (temporal)
[Figure labels: fragment beginning, fragment end, playback progress, original resource length]

Media Fragments (spatial) + Demo
[Figure labels: semi-opaque overlay, highlighted fragment]

Media Fragments URIs
- Bookmark / share parts (fragments) of audio/video content
- Annotate media fragments
- Search for media fragments
- Mash-ups
- Conserve bandwidth
Requirements: http://www.w3.org/TR/media-frags-reqs/
Specification: http://www.w3.org/TR/media-frags/
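
To make the '#' syntax concrete, the sketch below splits the temporal (`t`) and spatial (`xywh`) dimensions out of a Media Fragment URI. It is a minimal illustration under stated assumptions: only plain seconds are handled for `t` (the specification also allows clock formats, which are ignored here), and the function name `parse_media_fragment` is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def parse_media_fragment(uri):
    """Return the temporal and spatial dimensions found in the URI fragment.

    Simplified sketch: t= is read as plain seconds, xywh= as pixels or percent.
    """
    params = parse_qs(urlparse(uri).fragment)
    result = {}

    if "t" in params:
        value = params["t"][-1]
        # An optional "npt:" prefix may precede the time values
        if value.startswith("npt:"):
            value = value[len("npt:"):]
        start, _, end = value.partition(",")
        # A missing start means 0; a missing end means "until the end"
        result["t"] = (float(start) if start else 0.0,
                       float(end) if end else None)

    if "xywh" in params:
        value = params["xywh"][-1]
        unit = "pixel"
        if ":" in value:
            unit, _, value = value.partition(":")
        x, y, w, h = (int(n) for n in value.split(","))
        result["xywh"] = (unit, x, y, w, h)

    return result


print(parse_media_fragment("http://data.linkedtv.eu/media/e2899e7f#t=14,15"))
```

The example URI is the one shown later in the deck; a spatial fragment would look like `#xywh=percent:25,25,50,50`.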

Video annotation

Video interactivity
[Figure labels: concept in player; facets / properties of concept; content enrichment. Example concepts: Cubism, Expressionism, Fauvism]

Video Accessibility
- What is required to make video accessible on the Web?
- Technologies:
  - Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
  - Addressing: pointing to, retrieving, transmitting only parts of media
  - Rendering: video visualization for the impaired, Braille output
Benchmarking: Sphinx, HTK, Julius

Semantic indexing at the fragment level
- Benchmarking: Sphinx, HTK, Julius
- NER on subtitle blocks
- Interlinking with the Linked Data Cloud to enable semantic search

What is a Named Entity Recognition task?
- A task that aims to locate and classify, in a textual document, the names of persons, organizations, locations, brands and products, as well as numeric expressions such as times, dates, monetary amounts and percentages

NER Tools and Web APIs
- Standalone software: GATE, Stanford CoreNLP, Temis
- Web APIs: http://nerd.eurecom.fr/

NERD: Named Entity Recognition and Disambiguation
- Compare the performance of NER and NEL tools
- Understand the strengths and weaknesses of different Web APIs
- Adapt NER processing to different contexts
- (Learn how to) combine NER (/ NEL) tools
- Participate in various benchmarks

What is NERD?
- Ontology: http://nerd.eurecom.fr/ontology
- REST API: http://nerd.eurecom.fr/api/application.wadl
- UI: http://nerd.eurecom.fr

Factual comparison of 10 Web NER tools

Tool | Language | Granularity | Entity position | Classification schema | Number of classes | Response format | Quota (calls/day)
Alchemy API | EN, FR, GR, IT, PT, RU, SP, SW | OEN | N/A | Alchemy | 324 | JSON, MicroFormat, XML, RDF | 30000
DBpedia Spotlight | EN, GR*, PT*, SP* | OEN | char offset | DBpedia, FreeBase, Schema.org | 320 | HTML, JSON, RDF, XML | unlimited
Evri | EN, IT | OED | N/A | Evri | 5 | HTML, JSON, RDF | 3000
Extractiv | EN | OEN | word offset | DBpedia | 34 | HTML, JSON, RDF, XML | 3000
Lupedia | EN, FR, IT | OEN | range of chars | DBpedia, LinkedMDB | 319 | HTML, JSON, RDFa, XML | unlimited
Open Calais | EN, FR, SP | OEN | char offset | Open Calais | 95 | JSON, MicroFormat | 50000
Saplo | EN, SW | OED | N/A | N/A | 5 | JSON | 1333
Wikimeta | EN, FR, SP | OEN | POS offset | ESTER | 7 | JSON, XML | unlimited
Yahoo! | EN | OEN | range of chars | Yahoo | 13 | JSON, XML | 5000
Zemanta | EN | OED | N/A | FreeBase | 81 | XML, JSON, RDF | 10000

NERD Ontology
- Aligns the taxonomies used by the extractors

Building the NERD Ontology

NERD type | Occurrence
Person | 10
Organization | 10
Country | 6
Company | 6
Location | 6
Continent | 5
City | 5
RadioStation | 5
Album | 5
Product | 5
... | ...

NERD REST API
- Methods: GET, POST, PUT, DELETE
- Resources: /document, /user, /annotation/{extractor}, /extraction, /evaluation, ...
- Output formats: JSON, RDF

JSON example:
    "entities" : [{
        "entity": "Tim Berners-Lee",
        "type": "Person",
        "uri": "http://dbpedia.org/resource/Tim_berners_lee",
        "nerdType": "http://nerd.eurecom.fr/ontology#Person",
        "startChar": 30,
        "endChar": 45,
        "confidence": 1,
        "relevance": 0.5
    }]

Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
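
The sketch below shows one way a client might read such a JSON response. It does not call the NERD service; it only parses the sample payload from the slide. The enclosing braces around "entities", the function name `summarize_entities` and the `min_confidence` parameter are assumptions made for illustration.

```python
import json

# Sample payload copied from the slide; the enclosing braces are an assumption
sample = """
{"entities": [{
    "entity": "Tim Berners-Lee",
    "type": "Person",
    "uri": "http://dbpedia.org/resource/Tim_berners_lee",
    "nerdType": "http://nerd.eurecom.fr/ontology#Person",
    "startChar": 30,
    "endChar": 45,
    "confidence": 1,
    "relevance": 0.5
}]}
"""


def summarize_entities(payload, min_confidence=0.0):
    """Return (surface form, NERD class name, start, end) for each entity."""
    rows = []
    for item in json.loads(payload)["entities"]:
        if item.get("confidence", 0) < min_confidence:
            continue
        # Keep only the local name of the NERD ontology class
        nerd_class = item["nerdType"].rsplit("#", 1)[-1]
        rows.append((item["entity"], nerd_class,
                     item["startChar"], item["endChar"]))
    return rows


print(summarize_entities(sample))
```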

NERD meets NIF
- Model documents through a set of strings dereferenceable on the Web

    :offset_23107_23110 a str:String ;
        str:referenceContext :offset_0_26546 .

- Map string to entity

    :offset_23107_23110 sso:oen dbpedia:W3C .

- Classification

    dbpedia:W3C rdf:type nerd:Organization .

Rizzo G., Troncy R., Hellmann S. and Bruemmer M. (2012), NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud. In: Linked Data on the Web (LDOW'12), WWW'12, Lyon, France.
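
As a rough illustration of the offset-based naming shown on this slide, the sketch below builds the same three statements from an entity's character offsets. The prefixes (`:`, `str:`, `sso:`, `nerd:`) are copied from the slide and are not declared here; the function name `nif_statements` and its parameters are hypothetical, and the output is a Turtle fragment rather than a complete document.

```python
def nif_statements(doc_length, start, end, entity_curie, nerd_class):
    """Build Turtle statements for one entity mention.

    doc_length   -- length of the whole document in characters
    start, end   -- character offsets of the mention
    entity_curie -- prefixed name of the linked resource, e.g. "dbpedia:W3C"
    nerd_class   -- local name of the NERD class, e.g. "Organization"
    """
    subject = f":offset_{start}_{end}"
    context = f":offset_0_{doc_length}"
    return [
        # The mention is a string anchored in the whole document
        f"{subject} a str:String ;",
        f"    str:referenceContext {context} .",
        # Map the string to an entity
        f"{subject} sso:oen {entity_curie} .",
        # Classify the entity with a NERD type
        f"{entity_curie} rdf:type nerd:{nerd_class} .",
    ]


print("\n".join(nif_statements(26546, 23107, 23110, "dbpedia:W3C", "Organization")))
```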

NERD User Dashboard

NERD User Interface

History of NER benchmarks
- CoNLL 2003 and CoNLL 2005
  - schema (4 types): person, organization, location and miscellaneous
- ACE 2004, ACE 2005 and ACE 2007
  - schema (7 types): person, organization, location, facility, weapon, vehicle and geo-political entity
  - entity recognition, co-reference, finding relationships among the extracted entities
- TAC 2009 (Knowledge Base Track)
  - schema (3 types): person, organization and location
  - create a knowledge base from the extracted named entities
- ETAPE 2012 (Named Entity Task)
  - schema: Quaero (7 main types, 32 sub-types)
- MSM 2013: tweet corpus!
  - schema (4 types): person, organization, location, miscellaneous

ETAPE 2012 challenge

Genre | Train | Dev | Test | Sources
TV news | 7h 40m | 1h 40m | 1h 40m | BFM Story, Top Questions (LCP)
TV debates | 10h 30m | 5h 10m | 5h 10m | Pile et Face, Ca vous regarde, Entre les lignes (LCP)
TV amusements | - | 1h 05m | 1h 05m | La place du village (TV8)

Item | Train | Dev | Eval
Length | 26h | 10h 55m | 10h 55m
Nb files | 44 | 15 | 15
Nb words | 290517 | 91656 | 115511
Nb Named Entities | 46763 | 14398 | 13055
Nb unique categories | 33 | 33 | 33

NERD @ ETAPE (naïve combined strategy)
- Pipeline: extraction, cleaning, fusion
- Tuples produced by the extractors: (eA1, tA1, URIA1, siA1, eiA1), (eA2, tA2, URIA2, siA2, eiA2), (eA3, tA3, URIA3, siA3, eiA3), ..., (eN1, tN1, URIN1, siN1, eiN1), (eN2, tN2, URIN2, siN2, eiN2)
- When at least two extractors classify the same entity with different types, a preferred selection order (empirically defined) is applied: Wikimeta, AlchemyAPI, OpenCalais, Lupedia
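
The conflict rule above can be sketched as follows. This is an illustration only, not the authors' implementation: it assumes that "the same entity" means an identical character span, and the dictionary keys, the function name `fuse` and the sample annotations are hypothetical. Only the priority order comes from the slide.

```python
from collections import defaultdict

# Preferred selection order taken from the slide (highest priority first)
PRIORITY = ["wikimeta", "alchemyapi", "opencalais", "lupedia"]


def fuse(annotations):
    """Keep one annotation per (start, end) span, chosen by extractor priority.

    annotations: list of dicts with keys
    "extractor", "entity", "type", "uri", "start", "end".
    """
    rank = {name: position for position, name in enumerate(PRIORITY)}
    by_span = defaultdict(list)
    for annotation in annotations:
        by_span[(annotation["start"], annotation["end"])].append(annotation)

    fused = []
    for span in sorted(by_span):
        # Unknown extractors rank last
        best = min(by_span[span],
                   key=lambda a: rank.get(a["extractor"], len(PRIORITY)))
        fused.append(best)
    return fused


example = [
    {"extractor": "lupedia", "entity": "Paris", "type": "Person",
     "uri": None, "start": 10, "end": 15},
    {"extractor": "wikimeta", "entity": "Paris", "type": "Location",
     "uri": None, "start": 10, "end": 15},
]
print([a["type"] for a in fuse(example)])
```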

Participation at ETAPE (combined+ strategy)
- Tuples produced by the extractors: (eA1, tA1, URIA1, siA1, eiA1), (eA2, tA2, URIA2, siA2, eiA2), ..., (eN1, tN1, URIN1, siN1, eiN1), (eN2, tN2, URIN2, siN2, eiN2)
- Diagram elements: ETAPE Train & Dev; learned model; created static rules; POS tagger; apply rules; (e1, t1, URI1, si1, ei1); fusion
- Conflicts handled by priority selection: own, Wikimeta, AlchemyAPI, OpenCalais, Lupedia

NERD Global results

System | SLR | Precision | Recall | F-measure | %correct
combined | 86.85% | 35.31% | 17.69% | 23.44% | 17.69%
combined+ | 188.81% | 15.13% | 28.40% | 19.45% | 28.40%

combined+: the Eval corpus differs substantially from the Train and Dev corpora. The static rules do not fit the Eval corpus well and introduce classification noise.

Per-extractor results

System | SLR | Precision | Recall | F-measure | %correct
alchemyapi | 37.71% | 47.95% | 5.45% | 9.68% | 5.45%
lupedia | 39.49% | 22.87% | 1.56% | 2.91% | 1.56%
opencalais | 37.47% | 41.69% | 3.53% | 6.49% | 3.53%
wikimeta | 36.67% | 19.40% | 4.25% | 6.95% | 4.25%
combined (nerd) | 86.85% | 35.31% | 17.69% | 23.44% | 17.69%
combined+ (nerd+) | 188.81% | 15.13% | 28.40% | 19.45% | 28.40%

Learning How to Combine NER Extractors

NERD on CoNLL 2003 (NER task)

NERD on MSM 2013 (NER task)

NERD on MSM 2013 (NEL task)

Media Fragment Enricher: http://mfe.synote.org/mfe/

Linking pieces of knowledge

Named Entities for Video Classification

Workflow
Components: Media Fragment Enricher UI; Media Fragment Enricher Services; Metadata & timed-text; NERD Client; RDFizator; Triple Store; Categorization; Video and metadata preview; Video replay with subtitles and aligned NEs
Steps:
1: Video URL
2: Metadata
3: Metadata
4: NERDify
5: Timed Text
6: NEs with time alignment (JSON)
7: RDFize (TTL)
8: Generate Category
9: SPARQL query
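
Step 6 (named entities with time alignment) can be pictured with the sketch below, which attaches each entity found in a subtitle block to a temporal Media Fragment URI for that block. It is a simplified assumption, not the Media Fragment Enricher code: entities are matched by case-insensitive substring search rather than by the character offsets an extractor would return, and the function name `entity_fragments` and the sample subtitle text are hypothetical.

```python
def entity_fragments(video_uri, subtitle_blocks, entities):
    """Pair each entity with the media fragment of the block that mentions it.

    subtitle_blocks: list of (start_seconds, end_seconds, text)
    entities: list of entity surface forms
    """
    fragments = []
    for start, end, text in subtitle_blocks:
        lowered = text.lower()
        for label in entities:
            if label.lower() in lowered:
                # Temporal fragment: #t=start,end (in seconds)
                fragments.append((label, f"{video_uri}#t={start},{end}"))
    return fragments


blocks = [(14, 15, "Welcome to Casablanca"), (16, 19, "Nothing to see here")]
print(entity_fragments("http://data.linkedtv.eu/media/e2899e7f",
                       blocks, ["Casablanca"]))
```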

Channel signature based on NE distribution

Media Collector
- Composition of media item extractors (12 social networks)
- Relies on search APIs plus a fixed 30 s timeout window to provide results
- Falls back on screen scraping when necessary (Twitter ecosystem)
- Implemented as a NodeJS server
- Serializes results in a common schema (JSON)

[Figure labels: deep link; permalink; clean text for NLP processing; aggregate view of ALL social interactions; 12 social networks]

Media Finder (www2013)

Media Finder (zooming on media items)

Media Finder (timeline view)

Media Finder Architecture
- Media items harvesting using the Media Server
  - http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
  - https://github.com/vuknje/media-server (@tomayac fork)
- Image near-deduplication
  - DCT signature on image and video frame, Hamming distance between image pairs
- Clustering and disambiguation
  - Named Entity Extraction using NERD
  - Topic Generation using LDA
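
The near-deduplication step compares signatures by Hamming distance. The sketch below covers only that comparison and assumes the DCT signatures have already been computed and packed into integers; the threshold `max_distance`, the function names and the sample signatures are hypothetical.

```python
def hamming_distance(sig_a, sig_b):
    """Number of bit positions in which two integer signatures differ."""
    return bin(sig_a ^ sig_b).count("1")


def near_duplicates(signatures, max_distance=4):
    """Return pairs of item ids whose signatures differ in few bits.

    signatures: dict mapping an item id to its integer signature.
    """
    ids = sorted(signatures)
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            distance = hamming_distance(signatures[first], signatures[second])
            if distance <= max_distance:
                pairs.append((first, second))
    return pairs


sigs = {"a": 0b11110000, "b": 0b11110001, "c": 0b00001111}
print(near_duplicates(sigs, max_distance=1))
```

The pairwise loop is quadratic in the number of items, which is acceptable for a sketch but would need indexing for large collections.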

Media Finder (named entities clustering)

Media Finder (zooming in a cluster)

Media Finder: http://mediafinder.eurecom.fr/
- Live Topic Generation from Event Streams
  - WWW 2013 Demo Session
  - http://www.youtube.com/watch?v=8iRiwz7cDYY

Tracking an event: Italian Election
- Repeated queries over a period of time
  - We tracked and analyzed media posts tagged as elezioni2013 from 2013-02-26 to 2013-03-03
  - Cron job: every 30 minutes over the 6 days
  - Slice the data into 24-hour slots
- Research question: can we re-create the news headlines?
- Storyboarding: http://mediafinder.eurecom.fr/story/elezioni2013
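
Slicing the collected posts into 24-hour slots could look like the sketch below. The data layout (a list of timestamp and payload pairs), the function name `slice_into_slots` and the sample posts are assumptions; only the date range and the slot length come from the slide.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def slice_into_slots(posts, start, slot=timedelta(hours=24)):
    """Group posts by slot index, counted from the start of the tracking period.

    posts: iterable of (timestamp, payload) pairs.
    """
    slots = defaultdict(list)
    for timestamp, payload in posts:
        # Integer division of two timedeltas gives the slot number
        index = (timestamp - start) // slot
        slots[index].append(payload)
    return dict(slots)


tracking_start = datetime(2013, 2, 26)
sample_posts = [
    (datetime(2013, 2, 26, 9, 30), "post-1"),
    (datetime(2013, 2, 27, 0, 0), "post-2"),
    (datetime(2013, 3, 3, 23, 59), "post-3"),
]
print(slice_into_slots(sample_posts, tracking_start))
```

With the tracking period from the slide, the slot indices run from 0 to 5, one per day.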

Tracking an event: Italian Election
- Dataset:
  - ~16501 microposts containing (duplicate) media items
  - ~21087 Named Entities extracted
- Clustering
  - NER and LDA
  - Generate Bag of Entities (BOE) disambiguated with a DBpedia URI
- Examples: Monti, Bersani, Italia, Berlusconi, Grillo, Stelle

Tracking an event: Italian Election
- Tracking and Analyzing The 2013 Italian Election
  - ESWC 2013 Demo Session
  - http://www.youtube.com/watch?v=jIMdnwMoWnk

Multimedia and Semantic Web
- Different ecosystems:
  - Local identifiers
  - Specific metadata formats
  - Huge amount of multimedia content
  - Low number of links between content

Multimedia and Semantic Web
- Universal identifiers: URIs
- Common description formats
- Easy interlinking between content

Media Fragments
- Media Fragment URI 1.0
  - Chapters
  - Scenes
  - Shots
  - etc.
  - Example: http://data.linkedtv.eu/media/e2899e7f#t=14,15
- LinkedTV Ontology
[Figure annotations: nerd:Location Cafe Rick; nerd:Person H. Bogart; nerd:Person I. Bergman; nerd:Location Casablanca]

Hypervideo
[Figure annotations: nerd:Location Cafe Rick; nerd:Person H. Bogart; nerd:Person I. Bergman; nerd:Location Casablanca; nerd:Person E. Tierney; nerd:Location China]

Web + TV experience
http://www.youtube.com/watch?v=4mSC685AG7k

Research Vision (context)
- Knowledge Graphs everywhere
  - Google Knowledge Graph, Microsoft Entity Graph, Yahoo! Web of Things, Wikidata
  - Open Data, Structured Data, Linked Data
- The rise of social media
  - Events happen all the time and are the topic of social network conversations, also in the form of event-related multimedia data
  - Videos and photos are (re-)shared on multiple social networks
  - Events can be planned or unplanned
(Read the background story: http://www.washingtonpost.com/about-those-2005-and-2013-photos-of-the-crowds-in-st-peters-square)

Research Vision (opportunity)
- Video is a first-class citizen on the Web
  - Annotations: Ontology and API for Media Resources
  - Access: Media Fragments URI
  - NERD platform for extracting key information from learning resources, including videos
- The Linked Media vision
  - Extracting semantic knowledge from social media
  - Collect, enrich and visualize media memes shared by the crowd
  - Generate visual stories about what is happening in the world (summarization)

Winter School: http://winterschool.mediamixer.eu/

Credits
- Giuseppe Rizzo, Vuk Milicic, José Luis Redondo Garcia (EURECOM)
- Thomas Steiner (Google Inc.)
- Marieke van Erp (Free University of Amsterdam)
- Yunjia Li (University of Southampton)
- … and many other students

http://www.slideshare.net/troncy
