Information about Thursday 15 13 00h RODRIGO SANCHEZ JIMENEZ

Published on December 4, 2007

Author: Jancis


Thesauri managing and Software Agents: A Proposed Architecture

José Ramón Pérez Agüera, Departamento de Sistemas Informáticos y Programación, Universidad Complutense de Madrid.
Rodrigo Sánchez Jiménez, Departamento de Biblioteconomía y Documentación, Universidad Complutense de Madrid.

Paper Objectives
- To present a basic architecture that allows the use of thesauri for Information Retrieval tasks in a distributed environment.
- To present a detailed example of an implementation of such an architecture.

Intro: some explanations. Thesaurus definition
- Thesaurus in the Library and Information Science sense: a controlled vocabulary for a specialized application, made of semantically related conceptual expressions.
- These are not dictionaries (Roget and the like).
- Very common linguistic resources for NLP.
- Main functions:
  - Lexical normalization
  - Knowledge induction
  - Concept representation

Intro: some explanations. Thesaurus structure
- We are not dealing with KWIC or KWOC indexes.
- Semantic relationships inside a thesaurus:
  - Equivalence relationships (preferred and non-preferred terms)
  - Hierarchical relationships (narrower and broader terms)
  - Associative relationships (related terms)

The use of thesauri
- What for:
  - Query expansion
  - Word disambiguation
  - Translingual Information Retrieval
  - Automated categorization
- Why use thesauri and not other knowledge systems?
  - There are lots of them, already built, with a very well-known structure.
  - They are simple to use and simple to mark up automatically.

Our Goals
- Work on improving the use of thesauri on the Internet.
- Think of the Internet as an increasingly distributed environment.
- Adapt the use of thesauri to such a distributed environment.
- Use standard, Semantic Web oriented technologies in the process.

Architecture: Main issues
- Defining an information service that allows the use of thesauri in a distributed environment.
- Designing an application that acts as an interface between the thesaurus and a third-party application.
- Doing this as a Web Service.

Non-distributed architecture
- Integrated architecture.
- All tasks use the same thesauri and collection.
- No external knowledge is used unless it is brought into the system.
- Use of these particular thesauri by third-party applications is very difficult.

Distributed architecture
- The same agent can manage several thesauri.
- It can offer them to external applications.
- There is no need to maintain a separate thesaurus for every collection.

Initial requirements
- Machine-readable encoded thesauri
- Access and query functionalities
- Communication capabilities

Thesaurus Mark-up Models
There have been several proposals for thesaurus mark-up:
- RDF-based:
  - CERES
  - SKOS-Core
- Others:
  - Topic Maps
  - Zthes

Non RDF models
- Zthes: its main goal is to allow the implementation of systems that access thesauri through the Z39.50 and SRW protocols.
- Topic Maps: a standard for conceptual browsing. It has recently been given its own DTD.

RDF-based models: CERES
- Prior to SKOS-Core.
- Based upon the Z39.19 standard.
- Its main goal is to allow thesauri to be exchanged between applications.
- Uses RDF but is not tuned to OWL.
- No longer in use.

RDF-based models: SKOS-Core
- The main goal of SKOS-Core is to migrate knowledge organization systems to the Semantic Web environment.
- It gives us a basic framework for concept scheme generation without the complexities of OWL.
- It is easy to use, which should help its adoption by many users.
- It seems to have a promising future.

SKOS-Core I
- This RDF schema allows us to represent Concepts and Concept Schemes.
- One can easily think of a thesaurus as a collection of concepts and use SKOS to model this collection and its inner structure.
- It gives a concept-oriented description of the thesaurus instead of a term-oriented description.

SKOS-Core II. Modelling a thesaurus
Creating a concept scheme with skos:ConceptScheme

    <rdf:RDF xmlns:rdf="" xmlns:rdfs="" xmlns:skos="" xmlns:dc="">
      <skos:ConceptScheme rdf:about="http:/">
        <dc:title>SPINES</dc:title>
        <dc:description>Tesauro de política científica</dc:description>
        <dc:creator>UNESCO</dc:creator>
      </skos:ConceptScheme>
    </rdf:RDF>

SKOS-Core III. Modelling a thesaurus
Creating a concept with skos:Concept

    <rdf:RDF xmlns:rdf="" xmlns:skos="">
      <skos:Concept rdf:about="http:/">
        <skos:inScheme rdf:resource="http:/"/>
      </skos:Concept>
    </rdf:RDF>

The lexical and conceptual levels are kept apart. The terms that refer to the concept are added later.
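
The two modelling steps above (a concept scheme, then a concept attached to it) can also be produced programmatically. The following is a minimal Python sketch using only the standard library, not the authors' implementation. The namespace URIs are the published RDF, SKOS-Core and Dublin Core ones, which the slides leave blank; the base URI `http://example.org/spines`, the concept identifier `capital` and the helper names `q` and `build_scheme_with_concept` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Published namespace URIs (the xmlns values are blank in the slides).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical base URI for the scheme; not taken from the slides.
BASE = "http://example.org/spines"

# Ask ElementTree to serialize with the usual rdf:, skos: and dc: prefixes.
for prefix, uri in (("rdf", RDF), ("skos", SKOS), ("dc", DC)):
    ET.register_namespace(prefix, uri)


def q(namespace, local_name):
    """Build the '{namespace}local' form that ElementTree expects."""
    return f"{{{namespace}}}{local_name}"


def build_scheme_with_concept(concept_id):
    """Return an rdf:RDF element holding one concept scheme and one concept."""
    root = ET.Element(q(RDF, "RDF"))

    # Step 1: the concept scheme, described with Dublin Core properties.
    scheme = ET.SubElement(root, q(SKOS, "ConceptScheme"), {q(RDF, "about"): BASE})
    ET.SubElement(scheme, q(DC, "title")).text = "SPINES"
    ET.SubElement(scheme, q(DC, "creator")).text = "UNESCO"

    # Step 2: a concept with no labels yet, linked to its scheme.
    concept = ET.SubElement(
        root, q(SKOS, "Concept"), {q(RDF, "about"): f"{BASE}/{concept_id}"}
    )
    ET.SubElement(concept, q(SKOS, "inScheme"), {q(RDF, "resource"): BASE})
    return root


doc = build_scheme_with_concept("capital")
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Keeping the concept free of labels at this point mirrors the separation of conceptual and lexical levels: labels are attached to the concept as a later step.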

SKOS-Core IV. Adding terms to link up with concepts
This is done by using:
- skos:prefLabel for descriptors
- skos:altLabel for non-descriptors

    <rdf:RDF xmlns:rdf="" xmlns:skos="">
      <skos:Concept rdf:about="http:/">
        <skos:prefLabel>Capital</skos:prefLabel>
        <skos:altLabel>Activo</skos:altLabel>
        <skos:altLabel>Riqueza</skos:altLabel>
        <skos:inScheme rdf:resource="http:/"/>
      </skos:Concept>
    </rdf:RDF>

SKOS-Core V. Adding hierarchical relations
Here we can express various types of relations.

    <rdf:RDF xmlns:rdf="" xmlns:skos="">
      <skos:Concept rdf:about="">
        <skos:prefLabel>Eritrocitos</skos:prefLabel>
        <skos:altLabel>Glóbulos rojos</skos:altLabel>
        <skos:altLabel>Hematíes</skos:altLabel>
        <skos:inScheme rdf:resource="http:/"/>
        <skos:broader rdf:resource="http:/"/>
      </skos:Concept>
      <skos:Concept rdf:about="">
        <skos:prefLabel>Sangre</skos:prefLabel>
        <skos:altLabel>Plasma</skos:altLabel>
        <skos:altLabel>Suero sanguíneo</skos:altLabel>
        <skos:inScheme rdf:resource="http:/"/>
        <skos:narrower rdf:resource="http:/"/>
      </skos:Concept>
    </rdf:RDF>

SKOS-Core V. Adding associative relations

    <rdf:RDF xmlns:rdf="" xmlns:skos="">
      <skos:Concept rdf:about="">
        <skos:prefLabel>Eritrocitos</skos:prefLabel>
        <skos:altLabel>Glóbulos rojos</skos:altLabel>
        <skos:altLabel>Hematíes</skos:altLabel>
        <skos:inScheme rdf:resource="http:/"/>
        <skos:related rdf:resource="http:/"/>
      </skos:Concept>
      <skos:Concept rdf:about="">
        <skos:prefLabel>Sangre</skos:prefLabel>
        <skos:altLabel>Plasma</skos:altLabel>
        <skos:altLabel>Suero sanguíneo</skos:altLabel>
        <skos:inScheme rdf:resource="http:/"/>
        <skos:related rdf:resource="http:/"/>
      </skos:Concept>
    </rdf:RDF>

Thesaurus managing agent
- It is meant to act as a thesaurus server.
- Its core access functions are provided by the Concept and Thesaurus classes.
- The basic operations of the agent are provided by the NormalizeTerm and Searcher classes.

Thesaurus managing agent II. Thesaurus class
- This class abstracts and models a thesaurus.
- Thesauri are represented in a concept-oriented way.
- We used maps to guarantee maximal coherence.
- Core data is stored in the Concept class.
- Thesauri are modeled according to the ISO 2788 standard.

Thesaurus managing agent III. Thesaurus class II
- This class is serializable in native form, although database storage is being considered.
- Main functionalities:
  - Searching for a concept by any entry point.
  - Deleting concepts and creating new concepts.

Thesaurus managing agent IV. Concept class
- This class stores data about:
  - Preferred labels
  - Non-preferred labels
  - Related terms
  - Broader and narrower terms
  - Scope notes
- These data can be retrieved (e.g. get the scope note).
- These data can be modified, changing any value as thesaurus maintenance requires.

Thesaurus managing agent V. Basic Functionalities
- Two classes provide the basic functionalities:
  - NormalizeTerm
  - Searcher
- They provide basic operations, so that we can do something more than model and handle the thesaurus.

Thesaurus managing agent VI. NormalizeTerm class
- This class allows for easy term normalization against the thesaurus.
- Any string of words submitted to this class is resolved to the normalized term (descriptor) that the thesaurus prefers for the underlying concept.
- This process uses every possible entry point (every key of the thesaurus map) to guarantee successful retrieval.

Thesaurus managing agent VII. Searcher class
- This class is intended to retrieve a complete SKOS document representing a given concept.
- We submit a query and retrieve every related, narrower or broader term, as well as scope notes and the like.

    jose@leviathan:/eclipse/workspace/ThesaurusAgent$ java thes.Searcher cáncer

- This process is intended as an example of the many uses that can be made of the thesaurus.
- The information in this SKOS document can easily be used for translingual information retrieval or query expansion tasks.

Thesaurus managing agent VIII. Searcher class (example of use)

    jose@leviathan:/eclipse/workspace/ThesaurusAgent$ java thes.Searcher cáncer
    <rdf:RDF xmlns:rdf="" xmlns:skos="">
      <skos:Concept rdf:about="http://spines/neoplasmas%20malignos">
        <skos:broader rdf:resource="http://spines/enfermedades"/>
        <skos:related rdf:resource="http://spines/transformación%20neoplásica%20celular"/>
        <skos:prefLabel>neoplasmas malignos</skos:prefLabel>
        <skos:prefLabel xml:lang="en">malignant neoplasms</skos:prefLabel>
        <skos:prefLabel xml:lang="fr">neoplasmes malins</skos:prefLabel>
        <skos:related rdf:resource="http://spines/neoplasmas%20benignos"/>
        <skos:related rdf:resource="http://spines/hábito%20de%20fumar"/>
        <skos:related rdf:resource="http://spines/enfermedades%20incurables"/>
        <skos:altLabel>cáncer</skos:altLabel>
        <skos:altLabel>carcinoma</skos:altLabel>
        <skos:related rdf:resource="http://spines/i+d%20médica"/>
        <skos:related rdf:resource="http://spines/neoplasmas%20experimentales"/>
        <skos:related rdf:resource="http://spines/pechos"/>
        <skos:narrower rdf:resource="http://spines/neoplasmas%20inducidos%20por%20radiación"/>
        <skos:related rdf:resource="http://spines/antineoplásicos"/>
        <skos:related rdf:resource="http://spines/enfermedades%20de%20la%20mama"/>
        <skos:related rdf:resource="http://spines/enfermedades%20gastrointestinales"/>
        <skos:broader rdf:resource="http://spines/neoplasmas"/>
        <skos:related rdf:resource="http://spines/condiciones%20precancerosas"/>
        <skos:related rdf:resource="http://spines/enfermedades%20ginecológicas"/>
        <skos:related rdf:resource="http://spines/cancerígenos%20ambientales"/>
        <skos:related rdf:resource="http://spines/enfermedades%20del%20aparato%20genital"/>
        <skos:related rdf:resource="http://spines/amianto"/>
        <skos:narrower rdf:resource="http://spines/leucemias"/>
        <skos:narrower rdf:resource="http://spines/sarcoma"/>
      </skos:Concept>
    </rdf:RDF>

Thesaurus managing agent IX. Further capabilities
- The agent learns:
  - It keeps track of the terms most commonly used in queries to the system, so as to make them the preferred terms for a given concept.
  - It automatically adapts thesauri to the changes implied by queries from other agents or from users, pruning or adding elements as required.
- Other techniques can be applied:
  - Automatic thesaurus generation techniques
  - Automatic cohesion evaluation techniques
  - And so on…

Communicating with other applications
- To achieve the "distributed environment" goal we need to use communication standards.
- We need two communication protocols:
  - An agent communication protocol
  - A Web Service protocol

Communicating with other applications: FIPA-RDF
- FIPA provides agent-oriented communication standards.
- The FIPA-RDF specification allows agents to exchange RDF.
- It is meant to establish a communication channel between a thesaurus manager agent and an indexing, classifying or searching agent.
- The manager agent might send a response that includes an RDF document with all the necessary information.

Communicating with other applications: SOAP
- Classical web services that offer access to information are very well known.
- Services that offer access to functions are less well known. Google's SOAP service is a good example.
- These web services are meant to enhance what third-party applications can do.
- They allow for further specialization and can do an important part of the work.
- SOAP is a W3C recommendation, so it looks like a good choice. (It is also fully developed and implemented.)

CONCLUSIONS
- The proposed architecture is an example of an implementation of the principles behind the Semantic Web.
- We think that the direct use of already available resources such as thesauri can be a way of quickly building real, practical applications of the Semantic Web, and can contribute to its popularization.
- Working in a distributed environment means specialization and resource optimization.
- Making thesauri available for direct use by both users and applications would save a lot of work in concurrent digitising initiatives.
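
As a supplement to the description of the thesaurus managing agent, its concept-oriented map, term normalization and SKOS search can be sketched in a few lines. This is a minimal Python sketch under stated assumptions, not the authors' code (the slides run the agent with `java thes.Searcher`). The class layout, the method names `add`, `find`, `normalize` and `search`, and the case-insensitive matching are hypothetical; the namespace URIs are the published RDF and SKOS-Core ones, and the sample data is a small subset of the `cáncer` output shown earlier.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


@dataclass
class Concept:
    """Core data of one concept; relations are stored as preferred labels."""
    pref_label: str
    alt_labels: list = field(default_factory=list)
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    related: list = field(default_factory=list)


class Thesaurus:
    """Concept-oriented store: every label is a key into the same map."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self._index = {}

    def add(self, concept):
        # Index the concept under each entry point (descriptor and non-descriptors).
        for label in [concept.pref_label, *concept.alt_labels]:
            self._index[label.casefold()] = concept

    def find(self, term):
        return self._index.get(term.strip().casefold())

    def normalize(self, term):
        """Role of NormalizeTerm: map any known term to its descriptor."""
        concept = self.find(term)
        return concept.pref_label if concept else None

    def _uri(self, label):
        return f"{self.base_uri}/{label.replace(' ', '%20')}"

    def search(self, term):
        """Role of Searcher: return a SKOS document for the matching concept."""
        concept = self.find(term)
        if concept is None:
            return None
        root = ET.Element(f"{{{RDF}}}RDF")
        node = ET.SubElement(
            root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": self._uri(concept.pref_label)}
        )
        ET.SubElement(node, f"{{{SKOS}}}prefLabel").text = concept.pref_label
        for alt in concept.alt_labels:
            ET.SubElement(node, f"{{{SKOS}}}altLabel").text = alt
        for relation in ("broader", "narrower", "related"):
            for target in getattr(concept, relation):
                ET.SubElement(
                    node, f"{{{SKOS}}}{relation}", {f"{{{RDF}}}resource": self._uri(target)}
                )
        return ET.tostring(root, encoding="unicode")


spines = Thesaurus("http://spines")
spines.add(Concept(
    "neoplasmas malignos",
    alt_labels=["cáncer", "carcinoma"],
    broader=["neoplasmas"],
    narrower=["leucemias", "sarcoma"],
    related=["amianto"],
))
print(spines.normalize("Cáncer"))
print(spines.search("carcinoma"))
```

Indexing one concept object under all of its labels is one way to read the remark that maps are used to guarantee coherence: whichever entry point a query uses, it reaches the same concept data.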
