INEX2006 CSIRO


Published on November 2, 2007

Author: Kiska

Source: authorstream.com

CSIRO’s participation in INEX 2006
Alexander Krumpholz and David Hawking
18 Dec. 2006

Outline
Our topics
Our architecture
Our runs
Results of our runs
Other projects
  InexBib
  MyInstantExpert

Topic: Bridge types
title: bridge types
castitle: //*[about(.,bridge types)]
description: Which types of bridges exist?
narrative: A user wants to learn about different construction types of bridges, such as suspension bridges, bascule bridges, etc. While results about a particular bridge, like the Sydney Harbour Bridge, are not of special interest, a result is considered helpful if it mentions the construction type of that bridge, since this allows the user to formulate a more specific query.

Topic: Rhinoplasty
title: rhinoplasty
castitle: //article[about(.,rhinoplasty)]
description: What is rhinoplasty?
narrative: A person unhappy with his current look is considering getting a 'nose job' done. He wants to find all the information a potential patient could possibly get to help him make a decision. He would like to find out about past and current procedures, risks, etc. Knowing about other people who have had a rhinoplasty, and their reasons and results, might also help him make up his mind. Since it is a medical topic with major impact and risk for the patient, information about related operations (e.g. other plastic surgery) is considered relevant for the user as well.

Topic: Book - Architecture
title: book architecture
castitle: //template[about(.//@name,book reference)]//*[about(.,architecture)]
description: Show me books about architecture
narrative: After coming home from a trip to Venice, the user is intrigued to read more about architecture and city planning. He wants to find books about architecture and is therefore looking for references including the author and title. The user is not interested in landscaping and even less in computing.

Assessment
What did we get back?
Types of bridges… Dental Wire bridge bombs Soldering bridges Card games
Types of… trees, cars

XML retrieval
XML retrieval using Padre
Plain Text retrieval search engine
Splitting on elements to get small ‘documents’
Retrieving snippets
Post processing

Architecture
INEX topics
INEX collection
xml.cfg

PADRE Metadata classes - xml.cfg
document,//DOC
a,1,,//DOC/article
b,1,,/body
e,1,,/caption
f,1,,/p
g,1,,/figure
j,1,,/title
l,1,,/table
n,1,,/template
o,1,,/section
p,1,,/template@name
q,1,,/collectionlink
r,1,,//DOC/article/name
w,1,,/link
x,0,,/DOCNO
BUT: classes are disjoint!

Padre queries
NEXIQUERY: //article//figure[about(., Renaissance painting Italian Flemish -French -German)]
PADREQUERY: g:Renaissance g:painting g:Italian g:Flemish g:-French g:-German
NEXIQUERY: //article[about(.,wifi)]//section[about(.,wifi security encryption)]
PADREQUERY: a:wifi o:wifi o:security o:encryption

run.xml
<inex-submission participant-id="22" run-id="CSIRO-CO1-B" task="Focused" query="automatic">
  <topic-fields title="yes" castitle="no" description="no" narrative="no" ontopic_keywords="no"/>
  <description>Using the title as a query to padre, but suppressing overlapping results</description>
  <processing-instructions element-restriction="None" suppress="Overlap" padre-blocksize="1600">
    <query>query.padre_title.uniq.join(' ')</query>
  </processing-instructions>
  <collections>
    <collection>wikipedia</collection>
  </collections>
</inex-submission>

Retrieval Task
A) Thorough Task: Ranked elements
B) Focused Task: Ranked elements (no overlap)
C) All In Context Task: Ranked documents with matching elements (no overlap)
D) Best In Context Task: Ranked documents with entry point

Our runs
CSIRO-CAS1-A  Q: Content-And-Structure-Title  R: results as they are retrieved.
CSIRO-CAS2-A  Q: Content-And-Structure-Title  R: restricted to match the query.
CSIRO-CO1-A   Q: Content-Only-Title  R: results as they are retrieved.
CSIRO-CO1-B   Q: Content-Only-Title  R: suppressing overlapping results.
CSIRO-CO1-D   Q: Content-Only-Title  R: restricting results to articles.
CSIRO-CO2-A   Q: Content-And-Structure-Title without structural references  R: results as they are retrieved.
CSIRO-CO2-B   Q: Content-And-Structure-Title without structural references  R: suppressing overlapping results.
CSIRO-CO2-D   Q: Content-And-Structure-Title without structural references  R: restricting results to articles.

<title> versus <castitle>
<title>proprietary implementation +protocol +wireless +security</title>
<castitle>//article[about(., wireless) and about(.//p, security)] // link[about(., proprietary +implementation) ]</castitle>

Results A) Thorough
Rank  Run           MAep    Query
1                   0.0384
2                   0.0382
3                   0.0378
4                   0.0372
5                   0.0372
35    CSIRO_CO2_A   0.0167  cas-title without structure
37    CSIRO_CO1_A   0.0165  co-title
78    CSIRO_CAS1_A  0.0066  cas-title / as they are retrieved.
81    CSIRO_CAS2_A  0.0059  cas-title / restricted to match the query.
106                 0.0000

Results B) Focused with overlap on
Rank  Run          nxCG@5
1                  0.3961
2                  0.3891
3                  0.3769
4                  0.3723
5                  0.3689
19    CSIRO_CO1_B  0.3341
22    CSIRO_CO2_B  0.3323
85                 0.0000

Rank  Run          nxCG@50
1                  0.2265
2                  0.2260
3                  0.2224
4                  0.2136
5                  0.2129
23    CSIRO_CO2_B  0.1674
28    CSIRO_CO1_B  0.1648
85                 0.0000

CO1 – title
CO2 – cas title without structure
B – suppressing overlaps

Results B) Focused with overlap off
Rank  Run          nxCG@5
1                  0.4708
2                  0.4292
3                  0.4176
4                  0.4066
5                  0.3999
30    CSIRO_CO1_B  0.3361
31    CSIRO_CO2_B  0.3360
85                 0.0000

Rank  Run          nxCG@50
1                  0.2802
2                  0.2754
3                  0.2684
4                  0.2648
5                  0.2623
39    CSIRO_CO2_B  0.1643
41    CSIRO_CO1_B  0.1614
85                 0.0000

CO1 – title
CO2 – cas title without structure
B – suppressing overlaps

Results D) BestInContext
Metric: BEPD
Rank  Run          At A=0.01
1                  0.1959
2                  0.1722
3                  0.1630
4                  0.1621
5                  0.1614
65    CSIRO-CO2-D  0.0520
66    CSIRO-CO1-D  0.0514
77                 0.0000

Rank  Run          At A=100.0
1                  0.7983
2                  0.7957
3                  0.7900
4                  0.7701
5                  0.7683
62    CSIRO-CO2-D  0.4326
63    CSIRO-CO1-D  0.4264
77                 0.0002

CO1 – title
CO2 – cas title without structure
D – returning articles

Results D) BestInContext
Metric: EPRUM-BEP-Exh-BEPDistance
Rank  Run          At A=0.01
1                  0.0407
2                  0.0325
3                  0.0314
4                  0.0304
5                  0.0301
41    CSIRO-CO1-D  0.0166
43    CSIRO-CO2-D  0.0161
77                 0.0000

Rank  Run          At A=100.0
1                  0.3146
2                  0.3122
3                  0.3073
4                  0.3059
5                  0.3008
45    CSIRO-CO2-D  0.1942
46    CSIRO-CO1-D  0.1898
77                 0.0000

CO1 – title
CO2 – cas title without structure
D – returning articles

Thank you
Questions?
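
Note on the "Padre queries" slide: the two NEXIQUERY/PADREQUERY pairs suggest that each about() clause is rewritten by prefixing its terms with the metadata class letter of the enclosing element, as listed in xml.cfg. The Python sketch below reproduces just those two examples. It is not the authors' code: the function name nexi_to_padre and the regular expression are hypothetical, only about(., ...) clauses are handled, and leaving terms unprefixed for the wildcard element * is an assumption.

```python
import re

# Element-to-class letters copied from the xml.cfg slide.
ELEMENT_CLASS = {
    "article": "a", "body": "b", "caption": "e", "p": "f", "figure": "g",
    "title": "j", "table": "l", "template": "n", "section": "o",
    "collectionlink": "q", "link": "w",
}

# Matches "element[about(., terms)]"; only the "." context path is handled.
ABOUT = re.compile(r"(\w+|\*)\s*\[\s*about\(\s*\.\s*,\s*([^)]*)\)\s*\]")


def nexi_to_padre(nexi: str) -> str:
    """Rewrite a simple NEXI query into class-prefixed query terms (hypothetical helper)."""
    parts = []
    for element, terms in ABOUT.findall(nexi):
        prefix = ELEMENT_CLASS.get(element)  # None for "*" or unknown elements
        for term in terms.split():
            # Assumption: terms without a known class are passed through unprefixed.
            parts.append(f"{prefix}:{term}" if prefix else term)
    return " ".join(parts)


if __name__ == "__main__":
    print(nexi_to_padre(
        "//article[about(.,wifi)]//section[about(.,wifi security encryption)]"
    ))
```

Path steps without an about() clause, such as //article in the first slide example, contribute no terms in this sketch, which matches the PADREQUERY shown for it.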
