Information about Harabagiu2005

Published on November 20, 2007

Author: Gourmet

Source: authorstream.com

Employing Two Question Answering Systems in TREC 2005
  Harabagiu, Moldovan, et al. 2005
  Language Computer Corporation

Highlights
  Two systems
    PowerAnswer-2: factoids (main task)
    PALANTIR: relationships
  Bells and whistles
    Web-boosting strategy
    Abductive logic prover
    World-knowledge axioms: XWN, SUMO,…
  Results: “above median for all groups”
    53.4% main task, 20.4% relationships task

TREC 2005
  Tasks: Main (factoids), Relationships
  What’s new
    Question types: “Other”
    Answer types: Events
  Challenges
    More complex coreference resolution
    Temporal and other event-like constraints
    Discovering info nuggets for “Other” questions

Challenges: Coreference resolution
  TREC 2004: single antecedent for anaphora
  TREC 2005: more candidate antecedents…

Challenges: Inter-Question constraints
  A question and its answer constrain the subsequent questions
    The correct answer to Q136.5 depends on correctly resolving coreference with the correct answer to the previous question, Q136.4
  Event answer types
    Nominal answer types act as topics of subsequent questions
    Events constrain subsequent questions with event-like properties: time, participants…

The LCC Solution: Two Systems
  PowerAnswer-2
    Factoid questions
    Includes: abductive logic, temporal reasoner, world-knowledge axioms
    Bonus: discover interesting and novel nuggets for “Other” questions
  PALANTIR
    Relationship questions
    Includes: keyword expansion, topic representation, automatic lexicon generation

PowerAnswer-2: Architecture

PowerAnswer-2: Components
  Standard modules: QP, PR, AP
    Question Processor, Passage Retrieval, Answer Processor
  Sneaky module: WebBooster
  Fancy module: COGEX Logic Prover
    World knowledge: SUMO, eXtended WordNet, JAGUAR
    Linguistic knowledge: WordNet, manual ellipses and coreference axioms
    “Prove” correct answers with abductive logic
    Temporal inference from “advanced textual inference techniques”

WebBooster
  Exploit redundancy on the web for answer ranking
    Construct a series of search engine queries from “linguistic patterns” (morph/lex alternations?)
    Extract the most redundant answers from web documents
    “Boost” (i.e., increase the weight of) answers from the TREC collection that most closely match answers from the web collection
  Justification: the larger the set, the easier it is to pinpoint answers that more closely resemble the surface form of the question
  Results: 20.8% increase in factoid score

COGEX: Logic Prover
  Convert Question → QLF, Answer → ALF
  Perform “proof” on question over candidate answers
  Rank answers by semantic similarity to question
  Semantic similarity: WordNet!
    Ex: similarity of “buy” and “own” judged by length of connecting path in WordNet
  Results: 12.4% increase in factoid score

COGEX: Temporal Context Reasoner
  Document processing: index by dates
  Q and A processing: represent temporal relations as triples (S, E1, E2)
    S is a temporal signal (“during”, “after”); E1 and E2 are events
  Reasoning
    Prefer passages that match detected temporal constraints in Q
    Discover events related by temporal signals in the Q and candidate As
    Perform temporal unification between the Q and candidate As, boosting As that match Q times
  Results: 2% increase in factoid score

“Other” Questions
  Generic definition-pattern based nuggets
    “...Russian submarine Kursk, which is lying on the sea bed in the Barents Sea...”
  Answer-type based nuggets
    Nugget patterns specific to properties of answer type
    33 target classes generated by Naïve Bayes classifier on WordNet synsets
    Bing Crosby → musican_person: band, singer, born, …
  Entity-relationship based nuggets
    Nugget patterns are based on relations to other NEs
    Akira Kurosawa AND _date
    Akira Kurosawa AND _location
    …

PALANTIR: Architecture

PALANTIR: Keyword Selection
  Collocation detection
    Identify complete phrases that aren’t just bags of keywords (Organization of African States)
  Keyword ranking
    Detect overall importance of keyword in query
    Use keyword-density strategy for doc ranking
  Keyword expansion
    Synonyms, alternate forms for keywords

PALANTIR: Topic Representation
  Harvest “topic signatures” from text ??
  Find relationships between topic signatures
    Use syntax- and semantic-based relations between verbs and arguments
    Use context-based relations that exist between entities

PALANTIR: Lexicon Generation
  Q: Relationship questions have no single semantic answer type; how to identify appropriate answers from passages?
  A: By generating set-types on the fly, of course!
    Use weakly-supervised learning approach to identify semantic sets in question, then keywords relevant to that set (South American countries)
    Automatically generate a large db of syntactic frames that represent semantic relations

Results
  PowerAnswer-2
  PALANTIR

Summary
  WebBooster: 20% increase
  COGEX: 12% increase
  Temporal Reasoner: 2% increase
  Nugget-pattern discovery: 22.8% f-measure
  PALANTIR strategies:
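
Illustration for the WebBooster slide: the slides say only that TREC answers matching the most redundant web answers get their weight increased. The sketch below is one possible reading of that idea, not LCC's implementation. The function names, the multiplicative scoring formula, and the `boost` parameter are all assumptions made for illustration.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(answer.lower().split())


def web_boost(trec_candidates, web_answers, boost=0.5):
    """Re-rank TREC candidates by how often each one also shows up on the web.

    trec_candidates: list of (answer, score) pairs from the TREC collection.
    web_answers: list of answer strings extracted from web documents.
    boost: hypothetical weight for the redundancy bonus (not from the slides).
    """
    counts = Counter(normalize(a) for a in web_answers)
    total = sum(counts.values()) or 1  # guard against an empty web result set
    rescored = []
    for answer, score in trec_candidates:
        # Fraction of web answers that agree with this candidate.
        redundancy = counts[normalize(answer)] / total
        rescored.append((answer, score * (1.0 + boost * redundancy)))
    # Highest boosted score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Invented example data: "1995" starts ahead, but the web favors "1997".
candidates = [("1997", 0.50), ("1995", 0.52)]
web = ["1997", "1997", " 1997 ", "1995"]
ranked = web_boost(candidates, web)
```

With this invented data, the candidate that appears three times out of four on the web overtakes the one that started with a slightly higher score.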
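
Illustration for the COGEX slide: similarity of “buy” and “own” is said to be judged by the length of the connecting path in WordNet. The sketch below shows path-length similarity over a tiny hand-made graph. The graph edges are invented for illustration and are not real WordNet links, and the `1 / (1 + length)` formula is an assumed scoring choice, not something the slides specify.

```python
from collections import deque

# Hypothetical toy graph; these edges are invented, not taken from WordNet.
TOY_GRAPH = {
    "buy": ["acquire"],
    "acquire": ["buy", "have"],
    "have": ["acquire", "own"],
    "own": ["have"],
    "sell": ["transfer"],
    "transfer": ["sell"],
}


def path_length(graph, start, goal):
    """Breadth-first search: number of edges on the shortest path, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # the two words are not connected


def similarity(graph, a, b):
    """Shorter connecting path means higher similarity; unconnected words score 0."""
    length = path_length(graph, a, b)
    return 0.0 if length is None else 1.0 / (1.0 + length)


buy_own = similarity(TOY_GRAPH, "buy", "own")
buy_sell = similarity(TOY_GRAPH, "buy", "sell")
```

In this toy graph “buy” and “own” are three edges apart, so they score higher than “buy” and “sell”, which are not connected at all.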
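
Illustration for the Temporal Context Reasoner slide: temporal relations are represented as triples (S, E1, E2), and candidate answers that match the question's temporal constraints are boosted. The sketch below is a minimal reading of that step, assuming exact string matching on signals and events. The class and function names, the use of `None` for the unknown slot in a question, the additive `bonus`, and the example data are all hypothetical.

```python
from typing import NamedTuple, Optional


class TemporalTriple(NamedTuple):
    signal: str            # temporal signal S, e.g. "during" or "after"
    event1: Optional[str]  # E1; None marks an unknown slot in a question
    event2: Optional[str]  # E2; None marks an unknown slot in a question


def unifies(question: TemporalTriple, candidate: TemporalTriple) -> bool:
    """True if the signals agree and every filled question slot matches."""
    if question.signal != candidate.signal:
        return False
    for q_event, c_event in zip(question[1:], candidate[1:]):
        if q_event is not None and q_event != c_event:
            return False
    return True


def boost_matching(question, candidates, bonus=0.2):
    """Add a hypothetical bonus to candidates whose triple unifies with the question.

    candidates: list of (answer, score, TemporalTriple).
    Returns (answer, score) pairs sorted by score, highest first.
    """
    rescored = []
    for answer, score, triple in candidates:
        if unifies(question, triple):
            score += bonus
        rescored.append((answer, score))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Invented example: the question asks what happened after "sinking".
question = TemporalTriple("after", None, "sinking")
candidates = [
    ("answer A", 0.4, TemporalTriple("after", "rescue attempt", "sinking")),
    ("answer B", 0.5, TemporalTriple("before", "naval exercise", "sinking")),
]
ranked = boost_matching(question, candidates)
```

Only the candidate whose signal and filled event slot agree with the question receives the bonus, which is enough here to move it ahead of the initially higher-scored candidate.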
