Carol Peters

Information about Carol Peters

Published on December 18, 2007

Author: Elodie


Multilingual Information Access for Digital Libraries:
Carol Peters, ISTI-CNR, Pisa

MLIA - The Problem:
Increasing pressure for access to information without language or cultural barriers means there is a strong demand to be able to:
- Find information in foreign languages
- Read and interpret that information
- Merge it with information in other languages
Need for Multilingual Information Access

Global Information Society:
- WWW as platform for knowledge dissemination
- Distance Learning…
- Digital Libraries…
- Information providers and seekers should have equal opportunities
- Preservation of national languages

WWW and Internet:
- Internet is no longer monolingual and non-English content is growing rapidly
- User profile has changed radically
- From primarily academic use to widespread commercial, leisure, educational, entertainment etc. uses

Slide5:
78% of Internet Users will be Non-English Speaking by 2005
Source: Manning & Napier Information Services 2000 (confidential, unpublished information)

Evolution of non-English speaking population:
[Figure]

Slide8:
Widely Spoken Languages

What is MLIA?:
MLIA-related research regards the storage, access, retrieval and presentation of information in any of the world's languages.
Two main areas of interest:
- multiple language access, browsing, display
- cross-language information discovery and retrieval

Multi-Language Access, Browsing, Display:
The enabling technology:
- character encoding
- specific requirements of particular languages and scripts
- localization and presentation

Cross-Language Information Retrieval:
Crossing the language barrier…
- querying of a multilingual collection in one language against documents in many other languages…
- filtering, selecting, ranking retrieved documents
- presenting retrieved information in an interpretable and exploitable fashion

MLIA and Digital Libraries:
The neglected problem!
- It's hard!
- It's resource demanding!
- There are other issues to solve!
- BUT, everyone agrees, it's important!

MLIA is Resource Demanding:
- Multilingual Portals: how many languages / how many levels should be multilingual / how to handle updates
- Monolingual Search for Multiple Languages: encoding and representation issues / indexing issues (stop words, stemmers, morphological analysers ...)
- Cross-Language Search: translation resources (dictionaries, corpora, MT systems)
- Presentation of Results in a form exploitable by the user

Digital Library Projects in 5FP:
- 14 projects contained collections in multiple languages
- 4 had not considered any kind of multiple language processing (all text = English)
- 10 had monolingual retrieval functionality for all languages
- 1 had implemented cross-browsing of collections using a common metadata schema
- 6 had some kind of basic cross-language functionality:
  - 5 used a multilingual controlled vocabulary / thesaurus
  - 1 used bilingual dictionary search
  - 1 used pseudo-relevance feedback (in addition to thesaurus)
  - 1 proposed using similarity search (in addition to controlled vocabulary)

ETRDL:
- multilingual interfaces (6 languages)
- choice of interface language
- select language of document collection
- multiple language text processing

SCHOLNET:
ETRDL plus:
- cross-language search functionality
- multilingual thesaurus
- mechanisms for thesaurus maintenance and update
- free-text search on abstracts via pseudo-relevance feedback

ECHO:
- Film archives in 4 languages
- cross-language search via controlled vocabulary
- experimental corpus-based approach on speech recognition output

Slide18:
MUCHMORE Project for CLIR in Medical Domain

DELOS supports CLEF:
Cross-Language Evaluation Forum
- mono-, bi- and multilingual textual document retrieval on news collections (Ad Hoc)
- mono- and cross-language information on structured scientific data (Domain-Specific)
- interactive cross-language retrieval (iCLEF)
- multiple language question answering (QA@CLEF)
- cross-language retrieval in image collections (ImageCLEF)
- cross-language spoken document retrieval (CL-SR)
- multilingual retrieval of Web documents (WebCLEF)
- cross-language geographical retrieval (GeoCLEF)

The Challenge:
Bridge the gap between research and application
- Transfer research results to the real world
- Make existing resources and methodology generally available
- Raise awareness

What should we have now:
- Multilingual portals
- Support monolingual search in multiple languages: character encoding issues / stopword lists / stemmers / morphological analysers
- Support simple cross-language search:
  - multilingual metadata
  - interlingua or pivot language
  - thesauri for domain-specific search

Existing DL Software Systems:
- Some kind of multilingual support: D-Space, Greenstone, Open-DLib, NSDL
- Cross-language functionality: Cheshire

Cheshire Interface:
[Screenshot]

Digital Library Projects in 6FP:
- DILIGENT (DL infrastructure on GRID)
- BRICKS (DLMS for CH)
  - Translation manager
  - Dictionary-based translation
  - Accepts any language (in theory)
  - Query translated to language of collections being searched
  - Interactive stage
  - Metadata search
  - Results translated into user preferred language

The Future: A Targeted Multilingual/Multimedia Search Engine:
Problem: The Web contains a wealth of fragmented CH information but users are left to discover, interpret and aggregate it.
Objective:
- Provide targeted, enriched access to heterogeneous CH objects across all media types and language boundaries, supporting various user classes with aggregate views on complex task scenarios
- Assist CH institutions to raise visibility and disseminate content
Challenges:
- From document to complex objects retrieval
- CH concept and relation extraction
- Integration and representation of related objects
- Presentation of aggregate search results
- Focused crawling for acquisition of CH-related information from heterogeneous MM resources

Slide27:
MULTI MATCH
[Diagram: crawling and acquisition from museum databases and Web resources (museums, libraries, archives, newspapers, news agencies, personal pages, blogs). Example query "Van Gogh" with aggregated results: his life (1853-1890), …; paintings; other expressionists; exhibitions (Milano, …); critical reviews]

In the Meantime:
Recognise the problem, do as much as you can, keep it simple, and remember that MLIA = Interoperability
Standards:
- Unicode
- Multilingual Dublin Core
- RDF Encoding of Multilingual Thesauri
- OWL (Web Ontology Language)
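
The slides mention dictionary-based query translation (in the BRICKS translation manager) and stop-word handling only at the headline level. As a rough illustration of the idea, and not the method of BRICKS or any other project named above, the following minimal Python sketch translates a query term by term. The dictionary entries, the stop-word list, and the function names `normalize` and `translate_query` are all hypothetical and chosen only for this example.

```python
import unicodedata

# Hypothetical toy bilingual dictionary (English -> Italian); illustrative only.
EN_IT = {
    "painting": ["dipinto", "pittura"],
    "museum": ["museo"],
    "life": ["vita"],
}

# Hypothetical, deliberately tiny English stop-word list.
EN_STOPWORDS = {"the", "of", "in", "a", "an", "and"}


def normalize(token: str) -> str:
    """Apply Unicode NFC normalization, then case-fold the token."""
    return unicodedata.normalize("NFC", token).casefold()


def translate_query(query: str,
                    dictionary: dict[str, list[str]],
                    stopwords: set[str]) -> list[str]:
    """Translate a query term by term, keeping every candidate translation.

    Terms missing from the dictionary are passed through unchanged, on the
    assumption that proper names often match across languages.
    """
    translated = []
    for raw in query.split():
        term = normalize(raw)
        if term in stopwords:
            continue  # stop words carry little retrieval value
        translated.extend(dictionary.get(term, [term]))
    return translated


print(translate_query("The life of Van Gogh", EN_IT, EN_STOPWORDS))
# prints ['vita', 'van', 'gogh']
```

Keeping all candidate translations ("dipinto" and "pittura" for "painting") is the simplest choice; a real system would typically need disambiguation, stemming, and far larger translation resources, which is the resource demand the slides point to.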
