Question Answering Tutorial


Published on March 28, 2008

Author: Waldarrama

Source: authorstream.com

Question Answering Tutorial
John M. Prager
IBM T.J. Watson Research Center
jprager@us.ibm.com

Tutorial Overview
- Ground Rules
- Part I - Anatomy of QA: A Brief History of QA; Terminology; The Essence of Text-based QA; Basic Structure of a QA System; NE Recognition and Answer Types; Answer Extraction
- Part II - Specific Approaches: By Genre; By System
- Part III - Issues and Advanced Topics: Evaluation; No Answer; Question Difficulty; Dimensions of QA; Relationship Questions; Decomposition/Recursive QA; Constraint-based QA; Cross-Language QA
- References

Ground Rules
- Breaks, questions, topics
- Focus on English text
- TREC & AQUAINT & beyond
- General principles, tricks-of-the-trade, state-of-the-art methodologies
- My own system vs. my own research
- Caution

Caution
Nothing in this tutorial is true ... universally.

Part I - Anatomy of QA
- A Brief History of QA
- Terminology
- The Essence of Text-based QA
- Basic Structure of a QA System
- NE Recognition and Answer Types
- Answer Extraction

A Brief History of QA
- NLP front-ends to expert systems: SHRDLU (Winograd, 1972). The user manipulated, and asked questions about, a blocks world; the first real demonstration of combining syntax, semantics, and reasoning.
- NLP front-ends to databases: LUNAR (Woods, 1973) answered questions about moon rocks, using ATNs and procedural semantics. LIFER/LADDER (Hendrix et al., 1977) answered questions about U.S.
Navy ships, using a semantic grammar with domain information built into the grammar.
- NLP + logic: CHAT-80 (Warren & Pereira, 1982), an NLP query system in Prolog about world geography, built on Definite Clause Grammars.
- "Modern era of QA": MURAX (Kupiec, 1993), an NLP front-end to an encyclopaedia; NLP + hand-coded annotations to sources: AskJeeves (www.ask.com) and START (Katz, 1997), which started with text and was extended to multimedia; IR + NLP: TREC-8 (1999) (Voorhees & Tice, 2000).
- Today: all of the above.

Some "factoid" questions from TREC8-9
- 9: How far is Yaroslavl from Moscow?
- 15: When was London's Docklands Light Railway constructed?
- 22: When did the Jurassic Period end?
- 29: What is the brightest star visible from Earth?
- 30: What are the Valdez Principles?
- 73: Where is the Taj Mahal?
- 134: Where is it planned to berth the merchant ship, Lane Victory, which Merchant Marine veterans are converting into a floating museum?
- 197: What did Richard Feynman say upon hearing he would receive the Nobel Prize in Physics?
- 198: How did Socrates die?
- 199: How tall is the Matterhorn?
- 200: How tall is the replica of the Matterhorn at Disneyland?
- 227: Where does dew come from?
- 269: Who was Picasso?
- 298: What is California's state tree?

Terminology
- Question Type
- Answer Type
- Question Focus
- Question Topic
- Candidate Passage
- Candidate Answer
- Authority File/List

Terminology - Question Type
Question Type: an idiomatic categorization of questions, used to distinguish between different processing strategies and/or answer formats. E.g.
TREC2003:
- FACTOID: "How far is it from Earth to Mars?"
- LIST: "List the names of chewing gums"
- DEFINITION: "Who is Vlad the Impaler?"
Other possibilities:
- RELATIONSHIP: "What is the connection between Valentina Tereshkova and Sally Ride?"
- SUPERLATIVE: "What is the largest city on Earth?"
- YES-NO: "Is Saddam Hussein alive?"
- OPINION: "What do most Americans think of gun control?"
- CAUSE&EFFECT: "Why did Iraq invade Kuwait?"
- ...

Terminology - Answer Type
Answer Type: the class of object (or rhetorical type of sentence) sought by the question. E.g.
- PERSON (from "Who ..."), PLACE (from "Where ..."), DATE (from "When ..."), NUMBER (from "How many ...")
- ... but also EXPLANATION (from "Why ...") and METHOD (from "How ...")
Answer types are usually tied intimately to the classes recognized by the system's Named Entity Recognizer.

Terminology - Question Focus
Question Focus: the property or entity that is being sought by the question. E.g.
- "In what state is the Grand Canyon?"
- "What is the population of Bulgaria?"
- "What colour is a pomegranate?"

Terminology - Question Topic
Question Topic: the object (person, place, ...) or event that the question is about. The question might well be about a property of the topic, which will be the question focus. E.g. in "What is the height of Mt. Everest?", height is the focus and Mt. Everest is the topic.

Terminology - Candidate Passage
Candidate Passage: a text passage (anything from a single sentence to a whole document) retrieved by a search engine in response to a question. Depending on the query and the kind of index used, there may or may not be a guarantee that a candidate passage contains any candidate answers. Candidate passages will usually have associated scores from the search engine.
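As a concrete illustration of the answer-type idea above, a minimal rule table mapping question openings to answer types might look like the sketch below. The rules and type names are illustrative assumptions, not the tutorial's actual inventory; real systems tie these classes to their Named Entity Recognizer.

```python
# Illustrative wh-word -> answer-type mapping.
# Ordered so that the more specific "how many" is tried before "how".
ANSWER_TYPE_RULES = [
    ("who", "PERSON"),
    ("where", "PLACE"),
    ("when", "DATE"),
    ("how many", "NUMBER"),
    ("why", "EXPLANATION"),
    ("how", "METHOD"),
]

def answer_type(question):
    """Return a coarse answer type for a question string."""
    q = question.lower().strip()
    for prefix, atype in ANSWER_TYPE_RULES:
        if q.startswith(prefix):
            return atype
    return "UNKNOWN"

print(answer_type("Who was Picasso?"))      # PERSON
print(answer_type("How did Socrates die?")) # METHOD
```

Note that ordering matters: "How many ..." must be matched before the bare "how" rule, which is why the rules are kept in a list rather than a dict.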
Terminology - Candidate Answer
Candidate Answer: in the context of a question, a small quantity of text (anything from a single word to a sentence or bigger, but usually a noun phrase) that is of the same type as the answer type. In some systems the type match may be approximate, if there is a concept of confusability. Candidate answers are found in candidate passages. E.g.
- 50
- Queen Elizabeth II
- September 8, 2003
- by baking a mixture of flour and water

Terminology - Authority List
Authority List (or File): a collection of instances of a class of interest, used to test a term for class membership. Instances should be derived from an authoritative source and be as close to complete as possible. Ideally the class is small, easily enumerated, and its members have a limited number of lexical forms.
- Good: days of the week; planets; elements
- Good statistically, but difficult to get 100% recall: animals; plants; colours
- Problematic: people; organizations
- Impossible: all numeric quantities; explanations and other clausal quantities

Essence of Text-based QA
Need to find a passage that answers the question:
- Find a candidate passage (search)
- Check that the semantics of passage and question match
- Extract the answer
(Single-source answers.)

Essence of Text-based QA - Search
For a very small corpus, one can consider every passage as a candidate, but this is not interesting. Need to perform a search to locate good passages.
- If the search is too broad, not much has been achieved, and there is lots of noise.
- If the search is too narrow, good passages will be missed.
Two broad possibilities: optimize the search, or use iteration.

Essence of Text-based QA - Match
Need to test whether the semantics of the passage match the semantics of the question:
- Count question words present in the passage
- Score based on proximity
- Score based on syntactic relationships
- Prove the match

Essence of Text-based QA - Answer Extraction
Find candidate answers of the same type as the answer type sought in the question. This has implications for the size of the type hierarchy, and for where/when/whether to consider subsumption (considered later).

Basic Structure of a QA System
See for example Abney et al., 2000; Clarke et al., 2001; Harabagiu et al.; Hovy et al., 2001; Prager et al., 2000.
(Diagram: Question -> Question Analysis -> Query + Answer Type -> Search over corpus or Web -> Documents/passages -> Answer Extraction -> Answer)

Essence of Text-based QA - High-Level View of Recall
There are three broad locations in the system where expansion takes place, for purposes of matching passages. Where is the right trade-off?
- Question analysis: expand individual terms to synonyms (hypernyms, hyponyms, related terms); reformulate the question
- In the search engine: generally avoided for reasons of computational expense
- At indexing time: stemming/lemmatization

Essence of Text-based QA - High-Level View of Precision
There are three broad locations in the system where narrowing/filtering/matching takes place. Where is the right trade-off?
- Question analysis:
Include all question terms in the query; use IDF-style weighting to indicate preferences
- Search engine: possibly store POS information for polysemous terms
- Answer extraction: reward (penalize) passages/answers that do (don't) pass a test; particularly attractive for temporal modification

Answer Types and Modifiers: "Name 5 French cities"
Most likely there is no type for "French cities", so look for CITY and either:
- include "French/France" in the bag of words, and hope for the best;
- include "French/France" in the bag of words, retrieve documents, and look for evidence (deep parsing, logic);
- use high-precision language identification on the results.
If you have a list of French cities, you could instead:
- filter results by the list;
- use Answer-Based QA (see later);
- use longitude/latitude information of cities and countries.

Answer Types and Modifiers: "Name a female figure skater"
Most likely there is no type for "female figure skater", or even for "figure skater". Look for PERSON, with query terms {figure, skater}. What to do about "female"? Two approaches:
- Include "female" in the bag of words. This relies on the logic that if "femaleness" is an interesting property, it might well be mentioned in answer passages. It does not apply to, say, "singer".
- Leave out "female" but test candidate answers for gender. This needs either an authority file or a heuristic test, and the test may not be definitive.

Named Entity Recognition
- BBN's IdentiFinder (Bikel et al., 1999): Hidden Markov Model
- Sheffield GATE (http://www.gate.ac.uk/): development environment for IE and other NLP activities
- IBM's Textract/Resporator (Byrd & Ravin, 1999; Wacholder et al. 1997; Prager et al.
2000): FSMs and authority files
- plus others
The inventory of semantic classes recognized by the NER is related closely to the set of answer types the system can handle.

Probabilistic Labelling (IBM)
In Textract, a proper name can be one of the following: PERSON, PLACE, ORGANIZATION, MISC_ENTITY (e.g. names of laws, treaties, reports, ...). However, the NER needs another class (UNAME) for any proper name it can't identify, and in a large corpus many entities end up being UNAMEs. If, for example, a "Where" question seeks a PLACE, and similarly for the others above, then is being classified as UNAME a death sentence? How will a UNAME ever be searched for?

Probabilistic Labelling (IBM)
When an entity is ambiguous or plain unknown, use a set of disjoint special labels in the NER, instead of UNAME. This assumes the NER is able to rule out some possibilities, at least sometimes. Annotate with all remaining possibilities, and use these labels as part of the answer type. E.g.
- UNP <-> could be a PERSON
- UNL <-> could be a PLACE
- UNO <-> could be an ORGANIZATION
- UNE <-> could be a MISC_ENTITY
So {UNP UNL} <-> could be a PERSON or a PLACE; this would be a good label for Beverly Hills.

Probabilistic Labelling (IBM)
So "Who" questions that would normally generate {PERSON} as the answer type now generate {PERSON UNP}.
- Question: "Who is David Beckham married to?"
- Answer passage: "David Beckham, the soccer star engaged to marry Posh Spice, is being blamed for England's World Cup defeat."
- "Posh Spice" gets annotated with {UNP UNO}. A match occurs, the answer is found. Crowd erupts!

Issues with NER
- Coreference: should referring terms (definite noun phrases, pronouns) be labelled the same way as the referent terms?
- Nested noun phrases (and other structures of interest): what granularity? Partly depends on whether multiple annotations are allowed.
- Subsumption and ambiguity: what label(s) to choose?
- Probabilistic labelling

How to Annotate?
"... Baker will leave Jerusalem on Saturday and stop in Madrid on the way home to talk to Spanish Prime Minister Felipe Gonzales."
What about: "The U.S. ambassador to Spain, Ed Romero"?

Answer Extraction
Also called Answer Selection or Pinpointing: given a question and candidate passages, the process of selecting and ranking candidate answers. Usually, candidate answers are those terms in the passages which have the same answer type as that generated from the question. Ranking the candidate answers depends on assessing how well the passage context relates to the question. Three approaches:
- Heuristic features
- Shallow parse fragments
- Logical proof

Answer Extraction using Features
Heuristic feature sets (Prager et al., 2003+); see also Radev et al., 2000. Calculate feature values for each candidate answer, then calculate a linear combination using weights learned from training data. Ranking criteria:
- Good global context: evaluates the relevance of the passage from which the candidate answer is extracted to the question.
- Good local context: assesses the likelihood that the answer fills the gap in the question.
- Right semantic type: the semantic type of a candidate answer should be the same as, or a subtype of, the answer type identified by the question analysis component.
- Redundancy: the degree of redundancy for a candidate answer increases as more instances of the answer occur in retrieved passages.

Answer Extraction using Features (cont.)
Features for global context:
- KeywordsInPassage: the ratio of keywords present in a passage to the total number of keywords issued to the search engine.
- NPMatch: the number of words in noun phrases shared by both the question and the passage.
- SEScore: the ratio of the search-engine score for a passage to the maximum achievable score.
- FirstPassage: a Boolean value which is true for the highest-ranked passage returned by the search engine, and false for all other passages.
Features for local context:
- AvgDistance: the average distance between the candidate answer and keywords that occur in the passage.
- NotInQuery: the number of words in the candidate answer that are not query keywords.

Answer Extraction using Relationships
Computing ranking scores: use linguistic knowledge to compute passage and candidate-answer scores. Perform syntactic processing on the question and candidate passages, and extract predicate-argument and modification relationships from the parse.
- Question: "Who wrote the Declaration of Independence?" Relationships: [X, write], [write, Declaration of Independence]
- Answer text: "Jefferson wrote the Declaration of Independence." Relationships: [Jefferson, write], [write, Declaration of Independence]
Compute scores based on the number of question-relationship matches. Passage score: consider all instantiated relationships. Candidate-answer scores: consider relationships with the variable.

Answer Extraction using Relationships (cont.)
Example: "When did Amtrak begin operations?" Question relationships: [Amtrak, begin], [begin, operation], [X, begin]. Compute passage scores from the passages and their relationships:
- "In 1971, Amtrak began operations, ..." -> [Amtrak, begin], [begin, operation], [1971, begin] ...
- ""Today, things are looking better," said Claytor, expressing optimism about getting the additional federal funds in future years that will allow Amtrak to begin expanding its operations." -> [Amtrak, begin], [begin, expand], [expand, operation], [today, look] ...
- "Airfone, which began operations in 1984, has installed air-to-ground phones. ... Airfone also operates Railfone, a public phone service on Amtrak trains." ->
[Airfone, begin], [begin, operation], [1984, operation], [Amtrak, train] ...

Answer Extraction using Logic
Logical proof:
- Convert the question to a goal
- Convert the passage to a set of logical forms representing individual assertions
- Add predicates representing subsumption rules and real-world knowledge
- Prove the goal
See the section on LCC later.

Question Answering Tutorial - Part II
John M. Prager, IBM T.J. Watson Research Center, jprager@us.ibm.com

Part II - Specific Approaches
- By genre: Statistical QA; Pattern-based QA; Web-based QA; Answer-based QA (TREC only)
- By system: SMU; LCC; USC-ISI; Insight; Microsoft; IBM Statistical; IBM Rule-based

Approaches by Genre
Genres: Statistical QA; Pattern-based QA; Web-based QA; Answer-based QA (TREC only) - both Web-based and database-based. Considerations:
- Effectiveness by question type
- Precision and recall
- Expandability to other domains
- Ease of adaptation to CL-QA

Statistical QA
Use statistical distributions to model the likelihoods of the answer type and the answer. E.g. IBM (Ittycheriah, 2001); see the later section.

Pattern-based QA
For a given question type, identify the typical syntactic constructions used in text to express answers to such questions. Typically very high precision, but a lot of work to get decent recall.

Web-based QA
- Exhaustive string transformations: Brill et al., 2002
- Learning: Radev et al., 2001

Answer-based QA
Problem: sometimes it is very easy to find an answer to a question using resource A, but the task demands that you find it in resource B. Solution: first find the answer in resource A, then locate the same answer, along with the original question terms, in resource B. An artificial problem, but real for TREC participants.
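The relationship-based answer extraction sketched in Part I (matching predicate-argument pairs such as [Amtrak, begin]) can be reduced to a small overlap-counting sketch. This is a simplified illustration under stated assumptions, not any system's actual scorer: relationships are given as (head, dependent) string pairs, parsing is assumed already done, and the final type filter is a toy stand-in for a real answer-type check.

```python
def passage_score(question_rels, passage_rels):
    """Count question relationships instantiated in the passage.
    A question relationship containing the variable 'X' matches any
    passage relationship that shares its non-variable element."""
    score = 0
    for q in question_rels:
        for p in passage_rels:
            if q == p:
                score += 1
                break
            if "X" in q:
                fixed = q[0] if q[1] == "X" else q[1]
                if fixed in p:
                    score += 1
                    break
    return score

def candidate_answers(question_rels, passage_rels):
    """Passage terms that could fill the variable 'X'; in a real system
    these would then be filtered by the expected answer type."""
    cands = []
    for q in question_rels:
        if "X" not in q:
            continue
        fixed = q[0] if q[1] == "X" else q[1]
        for p in passage_rels:
            if fixed in p:
                other = p[0] if p[1] == fixed else p[1]
                if other != fixed:
                    cands.append(other)
    return cands

# "When did Amtrak begin operations?"
q_rels = [("Amtrak", "begin"), ("begin", "operation"), ("X", "begin")]
# "In 1971, Amtrak began operations"
p_rels = [("Amtrak", "begin"), ("begin", "operation"), ("1971", "begin")]

print(passage_score(q_rels, p_rels))  # 3
# Toy DATE filter keeps only the year-like candidate:
print([c for c in candidate_answers(q_rels, p_rels) if c.isdigit()])
```

The higher-scoring passage wins, exactly as in the Amtrak example above, where the 1971 passage instantiates all three question relationships while the Airfone passage matches fewer.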
Answer-based QA
Web-based solution: when a QA system looks for answers within a relatively small textual collection, the chance of finding strings/sentences that closely match the question string is small. When it looks on the Web, the chance of finding a correct answer is much higher (Hermjakob et al., 2002). Why this is true:
- The Web is much larger than the TREC corpus (3,000 : 1).
- TREC questions are generated from Web logs, and the style of language (and subjects of interest) in these logs is more similar to Web content than to newswire collections.

Answer-based QA
Database/knowledge-base/ontology solution: when the question syntax is simple and reliably recognizable, it can be expressed as a logical form. The logical form represents the entire semantics of the question, and can be used to access a structured resource:
- WordNet
- On-line dictionaries
- Tables of facts and figures
- Knowledge bases such as Cyc
Having found the answer, construct a query with the original question terms plus the answer, retrieve passages, and tell answer extraction the answer it is looking for.

Approaches of Specific Systems
SMU Falcon; LCC; USC-ISI; Insight; Microsoft; IBM. Note: some of the slides and/or examples in these sections are taken from papers or presentations by the respective system authors.

SMU Falcon
Harabagiu et al.
2000.

SMU Falcon
From the question, a dependency structure called the question semantic form is created. The query is a Boolean conjunction of terms. From answer passages that contain at least one instance of the answer type, an answer semantic form is generated. Three processing loops:
- Loop 1: triggered when too few or too many passages are retrieved from the search engine
- Loop 2: triggered when the question semantic form and answer semantic form cannot be unified
- Loop 3: triggered when unable to perform an abductive proof of answer correctness

SMU Falcon
The loops provide opportunities to perform alternations:
- Loop 1: morphological expansions and nominalizations
- Loop 2: lexical alternations - synonyms, direct hypernyms and hyponyms
- Loop 3: paraphrases
Evaluation (Pasca & Harabagiu, 2001): increase in accuracy on the 50-byte task in TREC9: Loop 1: 40%; Loop 2: 52%; Loop 3: 8%; combined: 76%.

LCC
Moldovan & Rus, 2001. Uses a Logic Prover for answer justification:
- Question logical form
- Candidate answers in logical form
- XWN glosses
- Linguistic axioms
- Lexical chains
The inference engine attempts to verify an answer by negating the question and proving a contradiction. If the proof fails, predicates in the question are gradually relaxed until the proof succeeds or the associated proof score falls below a threshold.

LCC: Lexical Chains
Q 1518: What year did Marco Polo travel to Asia?
Answer: "Marco Polo divulged the truth after returning in 1292 from his travels, which included several months on Sumatra."
Lexical chains:
(1) travel_to:v#1 -> GLOSS -> travel:v#1 -> RGLOSS -> travel:n#1
(2) travel_to:v#1 -> GLOSS -> travel:v#1 -> HYPONYM -> return:v#1
(3) Sumatra:n#1 -> ISPART -> Indonesia:n#1 -> ISPART -> Southeast_Asia:n#1 -> ISPART -> Asia:n#1

Q 1570: What is the legal age to vote in Argentina?
Answer: "Voting is mandatory for all Argentines aged over 18."
Lexical chains:
(1) legal:a#1 -> GLOSS -> rule:n#1 -> RGLOSS -> mandatory:a#1
(2) age:n#1 -> RGLOSS -> aged:a#3
(3) Argentine:a#1 -> GLOSS -> Argentina:n#1

LCC: Logic Prover
Question: "Which company created the Internet Browser Mosaic?"
QLF: (_organization_AT(x2)) & company_NN(x2) & create_VB(e1,x2,x6) & Internet_NN(x3) & browser_NN(x4) & Mosaic_NN(x5) & nn_NNC(x6,x3,x4,x5)
Answer passage: "... Mosaic, developed by the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign ..."
ALF: ... Mosaic_NN(x2) & develop_VB(e2,x2,x31) & by_IN(e2,x8) & National_NN(x3) & Center_NN(x4) & for_NN(x5) & Supercomputing_NN(x6) & application_NN(x7) & nn_NNC(x8,x3,x4,x5,x6,x7) & NCSA_NN(x9) & at_IN(e2,x15) & University_NN(x10) & of_NN(x11) & Illinois_NN(x12) & at_NN(x13) & Urbana_NN(x14) & nn_NNC(x15,x10,x11,x12,x13,x14) & Champaign_NN(x16) ...
Lexical chains (develop <-> make and make <-> create):
exists x2 x3 x4 all e2 x1 x7 (develop_vb(e2,x7,x1) <-> make_vb(e2,x7,x1) & something_nn(x1) & new_jj(x1) & such_jj(x1) & product_nn(x2) & or_cc(x4,x1,x3) & mental_jj(x3) & artistic_jj(x3) & creation_nn(x3)).
all e1 x1 x2 (make_vb(e1,x1,x2) <-> create_vb(e1,x1,x2) & manufacture_vb(e1,x1,x2) & man-made_jj(x2) & product_nn(x2)).
Linguistic axioms: all x0 (mosaic_nn(x0) -> internet_nn(x0) & browser_nn(x0))

USC-ISI
TextMap system (Ravichandran & Hovy, 2002; Hermjakob et al., 2003). Use of surface text patterns:
- "When was X born?" -> "Mozart was born in 1756", "Gandhi (1869-1948)"
- These can be captured in expressions: "<NAME> was born in <BIRTHDATE>", "<NAME> (<BIRTHDATE> -"
- Such patterns can be learned.

USC-ISI TextMap
Use bootstrapping to learn patterns. For an identified question type ("When was X born?"), start with known answers for some values of X:
- Mozart 1756; Gandhi 1869; Newton 1642
Issue Web search engine queries (e.g. "+Mozart +1756"), collect the top 1000 documents, then filter, tokenize, smooth, etc.
Use a suffix-tree constructor to find the best substrings, e.g. "Mozart (1756-1791)", and filter to "Mozart (1756-". Replace the query strings with, e.g., <NAME> and <ANSWER>. Determine the precision of each pattern: find documents with just the question term (Mozart), apply the patterns, and calculate precision.

USC-ISI TextMap: Finding Answers
- Determine the question type
- Perform the IR query
- Do sentence segmentation and smoothing
- Replace the question term by the question tag, i.e. replace Mozart with <NAME>
- Search for instances of the patterns associated with the question type
- Select words matching <ANSWER>
- Assign scores according to the precision of the pattern

Insight
Soubbotin, 2002; Soubbotin & Soubbotin, 2003. Performed very well in TREC10/11. Comprehensive and systematic use of "indicative patterns"; e.g. [capitalized word; paren; 4 digits; dash; 4 digits; paren] matches "Mozart (1756-1791)". The patterns are broader than named entities ("semantics in syntax"). Patterns have intrinsic scores (reliability), independent of the question.

Insight
Patterns with more sophisticated internal structure are more indicative of an answer; 2/3 of their correct entries in TREC10 were answered by patterns. E.g. with a == {countries}, b == {official posts}, w == {proper names (first and last)}, e == {titles or honorifics}, patterns for "Who is the President (Prime Minister) of a given country?" include: abeww; ewwdb,a; b,aeww.
Definition questions (A is the primary query term, X is the answer):
- <A; comma; [a/an/the]; X; [comma/period]> - for "Moulin Rouge, a cabaret"
- <X; [comma]; [also] called; A [comma]> - for "naturally occurring gas called methane"
- <A; is/are; [a/an/the]; X> - for "Michigan's state flower is the apple blossom"

Insight
Emphasis on shallow techniques, lack of NLP. Look in the vicinity of a text string potentially matching a pattern for "zeroing" - e.g.
for occupational roles: "former", "elect", "deputy", negation.
Comments:
- Relies on the redundancy of a large corpus
- Works for the factoid question types of TREC-QA; not clear how it extends
- Not clear how they match questions to patterns
- Named entities within patterns have to be recognized

Microsoft
Data-Intensive QA (Brill et al., 2002): "overcoming the surface string mismatch between the question formulation and the string containing the answer". The approach is based on the assumption/intuition that someone on the Web has answered the question in the same way it was asked. Want to avoid dealing with:
- Lexical, syntactic, and semantic relationships (between question and answer)
- Anaphora resolution
- Synonymy
- Alternate syntax
- Indirect answers
Take advantage of redundancy on the Web, then project to the TREC corpus (Answer-based QA).

Microsoft AskMSR
Formulate multiple queries; each rewrite has an intrinsic score. E.g. for "What is relative humidity?":
- ["+is relative humidity", LEFT, 5]
- ["relative +is humidity", RIGHT, 5]
- ["relative humidity +is", RIGHT, 5]
- ["relative humidity", NULL, 2]
- ["relative" AND "humidity", NULL, 1]
Get the top 100 documents from Google, extract n-grams from the document summaries, score each n-gram by summing the scores of the rewrites it came from, use tiling to merge n-grams, then search for supporting documents in the TREC corpus.

Microsoft AskMSR
Question: "What is the rainiest place on Earth?" Answer from the Web: "Mount Waialeale". Passage in the TREC corpus: "... In misty Seattle, Wash., last year, 32 inches of rain fell. Hong Kong gets about 80 inches a year, and even Pago Pago, noted for its prodigious showers, gets only about 196 inches annually. (The titleholder, according to the National Geographic Society, is Mount Waialeale in Hawaii, where about 460 inches of rain falls each year.)
..." Very difficult to imagine getting this passage by other means.

IBM Statistical QA (Ittycheriah, 2001)
The answer-type model (ATM) predicts, from the question and a proposed answer, the answer type they both satisfy. Given a question, an answer, and the predicted answer type, the answer-selection model (ASM) models the correctness of this configuration. The distributions are modelled using a maximum-entropy formulation. Training data: human judgments (for the ATM, 13K questions annotated with 31 categories; for the ASM, ~5K questions from TREC plus trivia).

p(c|q,a) = Σ_e p(c,e|q,a) = Σ_e p(c|e,q,a) p(e|q,a)

where q = question, a = answer, c = "correctness", e = answer type; p(e|q,a) is the answer-type model (ATM) and p(c|e,q,a) is the answer-selection model (ASM).

IBM Statistical QA (Ittycheriah)
- Question analysis (by the ATM): selects one of the 31 categories
- Search: question expanded by Local Context Analysis; top 1000 documents retrieved
- Passage extraction: top 100 passages that maximize question-word match, have the desired answer type, minimize dispersion of question words, and have similar syntactic structure to the question
- Answer extraction: candidate answers ranked using the ASM

IBM Rule-based
Predictive Annotation (Prager 2000, Prager 2003):
- Want to make sure passages retrieved by the search engine have at least one candidate answer
- Recognize that a candidate answer is of the correct answer type, which corresponds to a label (or several) generated by the Named Entity Recognizer
- Annotate the entire corpus and index the semantic labels along with the text
- Identify answer types in questions and include the corresponding labels in queries

IBM PIQUANT
Predictive Annotation example: the question is "Who invented baseball?" "Who" can map to PERSON$ or ORGANIZATION$. Suppose we assume only people invent things (it doesn't really matter).
So "Who invented baseball?" -> {PERSON$ invent baseball}. Consider the text: "... but its conclusion was based largely on the recollections of a man named Abner Graves, an elderly mining engineer, who reported that baseball had been 'invented' by Doubleday between 1839 and 1841."

IBM PIQUANT
Continuing the previous example: "Who invented baseball?" -> {PERSON$ invent baseball}. However, the same structure is equally effective at answering "What sport did Doubleday invent?" -> {SPORT$ invent Doubleday}.

IBM Rule-based: Handling Subsumption & Disjunction
- If an entity is of a type which has a parent type, how is annotation done?
- If a proposed answer type has a parent type, what answer type should be used?
- If an entity is ambiguous, what should the annotation be?
- If the answer type is ambiguous, what should be used?

Subsumption & Disjunction
Consider New York City, which is both a CITY and a PLACE. To answer "Where did John Lennon die?", it needs to be a PLACE; to answer "In what city is the Empire State Building?", it needs to be a CITY. We do NOT want to do the subsumption calculation in the search engine. Two scenarios:
1. Expand the answer type and use the most specific entity annotation:
   1A {(CITY PLACE) John_Lennon die} matches CITY
   1B {CITY Empire_State_Building} matches CITY
Or 2.
Use the most specific answer type and multiple annotations of NYC:
   2A {PLACE John_Lennon die} matches (CITY PLACE)
   2B {CITY Empire_State_Building} matches (CITY PLACE)
Case 2 is preferred for simplicity, because the disjunction in #1 would have to contain all hyponyms of PLACE, while the disjunction in #2 need only contain all hypernyms of CITY. Choice #2 suggests we can use disjunction in the answer type to represent ambiguity: "Who invented the laser?" -> {(PERSON ORGANIZATION) invent laser}.

Clausal Classes
Any structure that can be recognized in text can be annotated: quotations, explanations, methods, opinions, ... Any semantic class label used in annotation can be indexed, and hence used as a target of search:
- What did Karl Marx say about religion?
- Why is the sky blue?
- How do you make bread?
- What does Arnold Schwarzenegger think about global warming?

IBM: Predictive Annotation
Improving precision at no cost to recall. E.g. the question is "Where is Belize?" "Where" can map to (CONTINENT$, WORLDREGION$, COUNTRY$, STATE$, CITY$, CAPITAL$, LAKE$, RIVER$, ...). But we know Belize is a country, so "Where is Belize?" -> {(CONTINENT$ WORLDREGION$) Belize}.
- Belize occurs 1068 times in the TREC corpus
- Belize and PLACE$ co-occur in only 537 sentences
- Belize and CONTINENT$ or WORLDREGION$ co-occur in only 128 sentences

Virtual Annotation (Prager 2001)
Use WordNet to find all candidate answers (hypernyms), and use corpus co-occurrence statistics to select the "best" ones. Rather like the approach to WSD of Mihalcea and Moldovan (1999).

Parentage of "nematode" / Parentage of "meerkat"
(Figures: WordNet hypernym trees.)

Natural Categories
"Basic Objects in Natural Categories", Rosch et al. (1976). According to psychological testing, these are categorization levels of intermediate specificity that people tend to use in unconstrained settings.

What is this?
(Figure.)
What can we conclude?
There are descriptive terms that people are drawn to use naturally. We can expect to find instances of these in text, in the right contexts, and these terms will serve as good answers.

Virtual Annotation (cont.)
- Find all parents of the query term in WordNet.
- Look for co-occurrences of the query term and parent in the text corpus; expect to find snippets such as "... meerkats and other Y ...". Many different phrasings are possible, so we just look for proximity rather than parse.
- Scoring: count co-occurrences of each parent with the search term, and divide by the level number (only levels >= 1), generating a Level-Adapted Count (LAC). Exclude the very highest levels (too general).
- Select the parent with the highest LAC, plus any others with a LAC within 20%.

Parentage of "nematode" / Parentage of "meerkat"
(Figures: WordNet hypernym trees.)

Sample Answer Passages
- "What is a nematode?" -> "Such genes have been found in nematode worms but not yet in higher animals."
- "What is a meerkat?" -> "South African golfer Butch Kruger had a good round going in the central Orange Free State trials, until a mongoose-like animal grabbed his ball with its mouth and dropped down its hole. Kruger wrote on his card: 'Meerkat.'"
Use Answer-based QA to locate the answers.

Use of Cyc as a Sanity Checker
Cyc: a large knowledge base and inference engine (Lenat 1995). A post-hoc process for:
- rejecting "insane" answers ("How much does a grey wolf weigh?" - "300 tons");
- boosting confidence for "sane" answers.
The sanity checker is invoked with a predicate (e.g. "weight"), a focus (e.g. "grey wolf"), and a candidate value, e.g.
“300 tons” Sanity checker returns “Sane”: + or – 10% of value in Cyc “Insane”: outside of the reasonable range Plan to use distributions instead of ranges “Don’t know” Confidence score highly boosted when answer is “sane” Cyc Sanity Checking Example:  Cyc Sanity Checking Example TREC11 Q: “What is the population of Maryland?” Without sanity checking PIQUANT’s top answer: “50,000” Justification: “Maryland’s population is 50,000 and growing rapidly.” Passage discusses an exotic species, “nutria”, not humans With sanity checking Cyc knows the population of Maryland is 5,296,486. It rejects the top “insane” answers PIQUANT’s new top answer: “5.1 million” with very high confidence Question Answering Tutorial Part III:  Question Answering Tutorial Part III John M. Prager IBM T.J. Watson Research Center jprager@us.ibm.com Part III – Issues, Advanced Topics:  Part III – Issues, Advanced Topics Evaluation No Answer Question Difficulty Future of QA/Hot topics Dimensions of QA Relationship questions Decomposition / Recursive QA Constraint-based QA Cross-Language QA Evaluation:  Evaluation Relatively straightforward for “factoid” questions.
TREC-8 (1999) & TREC-9 (2000) 50-byte and 250-byte tasks Systems returned top 5 answers Mean Reciprocal Rank 1 point if top answer is correct, else 0.5 point if second answer is correct, else … 0.2 point if fifth answer is correct, else 0 Evaluation:  Evaluation For each question, a set of “correct” answers “Correctness” testing is easy to automate with pattern files, but patterns are subjective Patterns don’t/can’t test for justification Evaluation:  Evaluation TREC-10 (2001) Dropped 250-byte task Introduced NIL (No Answer) questions TREC-11 (2002) Instead of top 5 answers, systems returned top 1 Answer must be “exact” Definition questions (“What/who is X?”) dropped Results returned sorted in order of system’s confidence Scored by Confidence-Weighted Score (= Average Precision) TREC-12 (2003) Definition questions re-introduced, but answers assumed to be a collection of “nuggets” List questions introduced, answers must be exact Definition and List questions evaluated by F-measure biased to favour recall Factoid questions evaluated by fraction correct Confidence-Weighted Score (Average Precision):  Confidence-Weighted Score (Average Precision) = average of N different precision measures Score1 participates in every term Score2 participates in all but the first, … ScoreN participates in just the last term Much more weight given to early terms in the sum Contribution by Rank Position:  Contribution by Rank Position For N questions, if the contribution of a correct answer in position k is c_k, then c_k = c_{k+1} + 1/(kN), with c_{N+1} = 0. [Plot of contribution by rank position, N = 500] Average Precision:  Average Precision [Plot of average precision by rank, N = 500] Evaluation Issues:  Evaluation Issues What is really meant by “exact answer”? What if there is a mistake in the question? Suppose the question is “Who said X?”, where X is a famous saying with a mistake in it. Maybe the answer is NIL What granularity is required? “Where is Chicago?” “What is acetaminophen?” Difficult to answer without a model of the user.
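The Confidence-Weighted Score described above can be sketched in a few lines of Python. This is a minimal illustration of the metric, not TREC code; the boolean lists in the examples are made-up data, not actual TREC results:

```python
def confidence_weighted_score(correct):
    """Confidence-Weighted Score (average precision) as used in TREC-11.

    `correct` is a list of booleans, one per question, already sorted
    by the system's confidence (most confident first). The score is
    the average over all ranks i of the precision within the first i
    answers, so early ranks participate in many terms and dominate.
    """
    n = len(correct)
    running_correct = 0
    total = 0.0
    for i, is_correct in enumerate(correct, start=1):
        if is_correct:
            running_correct += 1
        total += running_correct / i  # precision at rank i
    return total / n

# A correct answer placed early contributes far more than a late one
# (hypothetical 4-question runs):
print(round(confidence_weighted_score([True, False, False, False]), 4))  # 0.5208
print(round(confidence_weighted_score([False, False, False, True]), 4))  # 0.0625
```

The two outputs also agree with the recurrence in the text: the contribution of a correct answer at rank k is c_k = c_{k+1} + 1/(kN), i.e. the sum of 1/(iN) for i from k to N.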
Questions with No Answer:  Questions with No Answer Subtle difference between: This question has no answer (within the available resources), This question has no answer (at all), and I don’t know the answer TREC-QA tests #1 (“NIL questions”), but systems typically answer as if #3 Strategies used: When allowed top 5 answers (with confidences) Always put NIL in position X (X in {2,3,4,5}) If some criterion succeeds, put NIL in position X (X in {1,2,3,4,5}) Determine some threshold T, and insert NIL at corresponding position in confidence ranking (1-5, or not) When single answer Determine some threshold T, and insert NIL if answer confidence < T NIL and CWS:  NIL and CWS When Confidence-Weighted Score is used, what should the NIL strategy be? If an answer has low confidence and is replaced by NIL, then what is its new confidence? Study strategy used by IBM in TREC11 (Chu-Carroll et al. 2003) No-Answer Confidence-Based Calculation :  No-Answer Confidence-Based Calculation Use TREC10 Data to determine strategy and thresholds Observe that lowest-confidence questions are more often No-Answer than correct Examine TREC10 distribution to determine cut-off threshold. Convert all questions below this to NIL. Improves average confidence of block. Move converted block to rank with same average precision. Confidences based on Grammatical Relationships Semantic Relationships Redundancy TREC10 Distribution:  TREC10 Distribution NIL CORRECT OUT OF xxxxxxxxxxxxxx.xx.xxxxxxxxxxxxxx.x..xx.xx 0 35 41 xxxxxx-x.-x.xxxxxxxx..x-xxxxxxxxxx.xxxxx.x.xxx.-xx 4 38 50 xx.....x.-xx.....xx....x.xx.x..xxx.xx...xx.x..xx.x 1 22 50 .-...x.xx-..x..x.xx....xx.x...xx.....x..xxx....xx. 2 18 50 ........x....x..xxxx...x...xx....xxxxx--......xxx. 2 17 50 ..x.xxx...-x-...xx.....x...xx--.xx-....xx..x..x... 5 16 50 ..x.x.-......x....x.x-.x.xx...-x-x-x-...-..x-x.x.x 8 15 50 x..-x.....x.x.....-..........-...-..x.-....-..x... 6 6 50 .x--......xx....-.-..x.-....-.-..x...........--... 
9 5 50 -.-.-..--...-x.xx....-.-x......-.....-..-...-.x.-. 13 5 50
Key: X Correct . Incorrect - NIL
TREC10 Distribution (annotation steps, shown incrementally on the original slides):
Changing all answers in the lowest-confidence block (the last two rows, 100 questions) to NIL gains 22-10 = 12 correct: the block contains 22 NIL questions but only 10 correct answers, and after conversion its two rows read “all” in the NIL column (9 and 13 of the converted answers correct, out of 50 each).
Note the confidence of the leading element of the block = C.
Calculate the precision of the block: P = 22/100.
Calculate the point in the ranking with the same local precision P, and note its confidence K.
NIL Placement in TREC11 Answers:  NIL Placement in TREC11 Answers
[Figure: TREC11 answers sorted by confidence, correctness unknown at submission time.]
Find the point with confidence C. (Block is of size 147.)
Make all answers in the block NIL, and add K-C to each confidence.
Find the point with confidence K; insert the block at this point, and subtract C from all confidences to its right.
NIL Placement in TREC11 Answers - Impact:  NIL Placement in TREC11 Answers - Impact
29 out of 46 NIL answers located – recall of .63 9 previously-correct answers lost Total of 20 correct questions gained … 20/500 = 4% Minimal (< 0.5%) improvement in final AP score Question Complexity:  Question Complexity “Simple” questions are not a solved problem: Complex questions can be decomposed into simpler components. If simpler questions cannot be handled successfully, there’s no hope for more complex ones. Areas not explored (intentionally) by TREC to date: spelling errors grammatical errors syntactic precision e.g. significance of articles “not”, “only”, “just” … Question Complexity:  Question Complexity When was Queen Victoria born? … King George III’s only granddaughter to survive infancy was born in 1819 … … Victoria was the only daughter of Edward, Duke of Kent … … George III’s fourth son Edward became Duke of Kent … All of the current leading economic indicators point in the direction of the Federal Reserve Bank raising interest rates at next week’s meeting. Alan Greenspan, Fed chairman. 42. (The Hitchhiker’s Guide to the Galaxy) Should the Fed raise interest rates? What is the meaning of life? Question Complexity:  Question Complexity Not a function of question alone, but rather the pair {question, corpus} In general, it is a function of the question and the resources to answer it, which include text corpora, databases, knowledge bases, ontologies and processing modules Complexity ≡ Impedance Match Future of QA:  Future of QA By fixing resources, can make factoid QA more difficult by intentionally exploiting requirements for advanced NLP and/or reasoning Questions that require more than one resource / document for an answer E.g. What is the relationship between A and B? 
Question decomposition Cross-language QA How to advance the field Dimensions of QA:  Dimensions of QA “Answer Topology” Characteristics of correct answer set Language Vocabulary & Syntax Question as a problem Enumeration, arithmetic, inference User Model Who’s asking the question Opinions, hypotheses, predictions, beliefs Answer Set Topology:  Answer Set Topology No Answer, one, many When are two different answers the same – Natural variation Size of an elephant Estimation Populations Variation over time Populations, Prime Ministers Choose correct presentation format Lists, charts, graphs, dialogues Language:  Language The biggest current roadblock to Question Answering is arguably Natural Language: Anaphora Definite Noun Phrases Synonyms Subsumption Metonyms Paraphrases Negation & other such qualification Nonce words Idioms Figures of speech Poetic & other stylistic variations … Negation (1):  Negation (1) Q: Who invented the electric guitar? A: While Mr. Fender did not invent the electric guitar, he did revolutionize and perfect it. Note: Not all instances of “not” will invalidate a passage. Questions as Word Problems:  Questions as Word Problems Text Match Find text that says “London is the largest city in England” (or paraphrase). “Superlative” Search Find a table of English cities and their populations, and sort. Find a list of the 10 largest cities in the world, and see which are in England. Uses logic: if L > all objects in set R then L > all objects in set E < R. Find the population of as many individual English cities as possible, and choose the largest. Heuristics London is the capital of England. (Not guaranteed to imply it is the largest city, but quite likely.) Complex Inference E.g. “Birmingham is England’s second-largest city”; “Paris is larger than Birmingham”; “London is larger than Paris”; “London is in England”. What is the largest city in England? Negation (2):  Negation (2) Name a US state where cars are manufactured. 
versus Name a US state where cars are not manufactured. Certain kinds of negative events or instances are rarely asserted explicitly in text, but must be deduced by other means Other Adverbial Modifiers (Only, Just etc.):  Other Adverbial Modifiers (Only, Just etc.) Name an astronaut who nearly made it to the moon To satisfactorily answer such questions, need to know what are the different ways in which events can fail to happen. In this case there are several. Need for User Model:  Need for User Model What is meant? The city: what granularity is required? The rock group The play/movie The sports team (which one?) Can hardly choose the right answer without knowing who is asking the question, and why. Where is Chicago? What is mold? Not all “What is” Questions are definitional:  Not all “What is” Questions are definitional Subclass or instance What is a powerful adhesive? Distinction from co-members of class What is a star fruit? Value or more common synonym What is a nanometer? What is rubella? Subclass/instance with property What is a yellow spotted lizard? Ambiguous: definition or instance What is an antacid? From a Web log: Attention to Details:  Attention to Details Tenses Who is the Prime Minister of Japan? Number What are the largest snakes in the world? Articles What is mold? Where is the Taj Mahal? ^ ^ Opinions, Hypotheses, Predictions and Beliefs:  Opinions, Hypotheses, Predictions and Beliefs What does X think about Y? Will X happen? ‘ “X will happen”, says Dr. A’ ‘Prof. B believes that X will happen.’ ‘X will happen’ (asserted by article writer) e.g. Is global warming real? What is appropriate for QA?:  What is appropriate for QA? How much emphasis should be placed on: Retrieval Built-in knowledge Computation Estimation Inference Sample questions What is one plus one? How many $2 pencils can I buy for $10? How many genders are there? How many legs does a person have? How many books are there in a local library? What was the dilemma facing Hamlet? 
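The “superlative search” strategy for word problems — find a table of entities and their values, filter, and sort — can be sketched as follows. This is a toy illustration only; the city/population rows are made-up placeholders standing in for an extracted table, not corpus data:

```python
def answer_superlative(table, filter_pred, key):
    """Answer 'What is the largest X in Y?' from tabular data:
    keep the rows satisfying the filter (membership in Y), then
    take the maximum by the key (e.g. population)."""
    candidates = [row for row in table if filter_pred(row)]
    if not candidates:
        return None  # no answer within the available resources
    return max(candidates, key=key)

# Hypothetical (city, country, population) rows; figures are illustrative.
cities = [
    ("London", "England", 7_000_000),
    ("Birmingham", "England", 1_000_000),
    ("Paris", "France", 2_000_000),
]

best = answer_superlative(cities,
                          filter_pred=lambda r: r[1] == "England",
                          key=lambda r: r[2])
print(best[0])  # London
```

The same subset logic the text appeals to applies here: if L is the largest element of a set R, it is also the largest element of any subset of R that contains L, so a “10 largest cities in the world” list can answer the England question without an exhaustive table.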
Relationship Questions:  Relationship Questions An exercise in the ARDA AQUAINT program. “What has been the relationship between Osama bin Laden and Sudan?” “What do Soviet Cosmonaut Valentina Tereshkova (Vladimirovna) and U.S. Astronaut Sally Ride have in common?” “What is the connection between actor and comedian Chris Rock and former Washington, D.C. mayor Marion Barry?” Two approaches (Cycorp and IBM) Cycorp Approach:  Cycorp Approach Single Strategy Use original question terms as IR query Break top retrieved documents into sentences Generate Bayesian network with words as nodes from Sentence x Word matrix Select ancestor terms to augment query E.g. “What is the connection between actor and comedian Chris Rock and former Washington, D.C. mayor Marion Barry?” Augmentation terms = {drug, arrested} Iterate, but where the new network has sentences as nodes Output sentences that are neighbours of the augmented query IBM Approach:  IBM Approach Multi-part Strategy, including: Extending pattern-based agent “What is the relationship between X and Y?” -> locate syntactic contexts with X and Y: conjunction, subject-verb-object, objects of prepositions. New profile-based agent Local Context Analysis on documents containing either X or Y Form vector of terms, normalize, intersect, sort “What do Valentina Tereshkova and Sally Ride have in common?” -> Space First Woman Collins (the first woman to ever fly the space shuttle) Decomposition/Recursive QA:  Decomposition/Recursive QA “Who/What is X” questions require a profile of the subject – QA-by-Dossier Can generate auxiliary questions based on type of question focus: When/where was X born? When/where/how did X die? What occupation did X have? Can generate follow-up questions based on earlier answers: What did X win? What did X write? What did X discover? Constraint-based QA:  Constraint-based QA QA-by-Dossier-with-Constraints Variation of QA-by-Dossier Ask auxiliary questions that constrain the answer to the original question. Prager et al.
(submitted) When did Leonardo paint the Mona Lisa?:  When did Leonardo paint the Mona Lisa? Constraints:  Capitalize on existence of natural relationships between events/situations that can be used as constraints E.g. A person’s achievements occurred during his/her lifetime. Develop constraints for a person and an achievement event: date(died) <= date(born) + 100 date(event) >= date(born) + 10 date(event) <= date(died) For each constraint variable, ask Auxiliary Question to generate set of candidate answers, e.g. When was Leonardo born? When did Leonardo die? Constraints Auxiliary Questions:  Auxiliary Questions When was Leonardo born? When did Leonardo die? Dossier-with-Constraints Process:  Dossier-with-Constraints Process Original Question Auxiliary Questions Constraints Constraint Satisfaction + Confidence Combination + + Cross-Language QA:  Cross-Language QA Probably easiest approach is to translate question to language of collection, and perform monolingual QA All considerations that apply to CL-IR apply to CL-QA, and then some: Named Entity Recognition Parsers Ontologies … Cross-Language QA:  Cross-Language QA Jung and Lee, 2002. User Query -> NLP -> SQL -> Relational Database Morphological Analysis and Linguistic Resources are language dependent. Generate Lexico-Semantic patterns Cross-Language QA:  Cross-Language QA TREC CLIR for several years CLEF (Cross-Language Evaluation Forum) http://clef.iei.pi.cnr.it:2002/ CLIR activities for several years CL-QA in 2003 http://clef-qa.itc.it/ References:  References Abney, S., Collins, M. and Singhal, A. “Answer Extraction”. In Proceedings ANLP 2000. E. Brill, J. Lin, M. Banko, S. Dumais and A. Ng, “Data-Intensive Question Answering”, in Proceedings of the 10th Text Retrieval Conference (TREC-2001), NIST, Gaithersburg, MD, 2002. D. Bikel, R. Schwartz, R. Weischedel, "An Algorithm that Learns What's in a Name," Machine Learning, 1999. Byrd, R. and Ravin, Y. 
“Identifying and Extracting Relations in Text.” In Proceedings of NLDB 99, Klagenfurt, Austria, 1999. Jennifer Chu-Carroll, John Prager, Christopher Welty, Krzysztof Czuba and David Ferrucci. "A Multi-Strategy and Multi-Source Approach to Question Answering", Proceedings of TREC2002, Gaithersburg, MD, 2003. Clarke, C.L.A., Cormack, G.V., Kisman, D.I.E. and Lynam, T.R. “Question answering by passage selection (Multitext experiments for TREC-9)” in Proceedings of the 9th Text Retrieval Conference, pp. 673-683, NIST, Gaithersburg, MD, 2001. Sanda Harabagiu, Dan Moldovan, Marius Pasca, Rada Mihalcea, Mihai Surdeanu, Razvan Bunescu, Roxana Girju, Vasile Rus and Paul Morarescu, FALCON: Boosting Knowledge for Answer Engines, in Proceedings of the 9th Text Retrieval Conference, pp. 479-488, NIST, Gaithersburg MD, 2001. Sanda Harabagiu, Dan Moldovan, Marius Pasca, Rada Mihalcea, Mihai Surdeanu, Razvan Bunescu, Roxana Girju, Vasile Rus and Paul Morarescu, The Role of Lexico-Semantic Feedback in Open-Domain Textual Question-Answering, in Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL-2001), July 2001, Toulouse France, pages 274-281. Gary G. Hendrix, Earl D. Sacerdoti, Daniel Sagalowicz, Jonathan Slocum: Developing a Natural Language Interface to Complex Data. VLDB 1977: 292 References:  References Hovy, E., Gerber, L., Hermjakob, U., Junk, M., and Lin, C-Y. “Question answering in Webclopedia” in Proceedings of the 9th Text Retrieval Conference, pp. 655-664, NIST, Gaithersburg, MD, 2001. Ulf Hermjakob, Abdessamad Echihabi and Daniel Marcu, Natural Language Based Reformulation Resource and Web Exploitation for Question Answering Proceedings of TREC2002, Gaithersburg MD, 2003. Hanmin Jung, Gary Geunbae Lee, Multilingual Question Answering with High Portability on Relational Databases Workshop on Multilingual Summarization and Question Answering, COLING 2002 Boris Katz. “Annotating the World Wide Web using natural language”. 
Proceedings RIAO 1997. Kupiec, J. “Murax: A robust linguistic approach for question answering using an on-line encyclopedia”. Proceedings 16th SIGIR, Pittsburgh, PA 2001. Lenat, D. B. 1995. "Cyc: A Large-Scale Investment in Knowledge Infrastructure." Communications of the ACM 38, no. 11. Mihalcea, R. and Moldovan, D. “A Method for Word Sense Disambiguation of Unrestricted Text”. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99), pp. 152-158, College Park, MD, 1999. Miller, G. “WordNet: A Lexical Database for English”, Communications of the ACM 38(11) pp. 39-41, 1995. Dan I. Moldovan and Vasile Rus, ``Logic Form Transformation of WordNet and its Applicability to Question Answering'', Proceedings of the ACL 2001 Conference, July 2001,Toulouse, France. Marius Pasca and Sanda Harabagiu, High Performance Question/Answering, in Proceedings of the 24th Annual International ACL SIGIR Conference on Research and Development in Information Retrieval (SIGIR-2001), September 2001, New Orleans LA, pages 366-374. References:  References John M. Prager, Jennifer Chu-Carroll and Krzysztof Czuba, "A Multi-Strategy, Multi-Question Approach to Question Answering" submitted for publication. Prager, J.M., Chu-Carroll, J., Brown, E.W. and Czuba, K. "Question Answering by Predictive Annotation”, in Advances in Open-Domain Question-Answering", Strzalkowski, T. and Harabagiu, S. Eds., Kluwer Academic Publishers, to appear 2003?. Prager, J.M., Radev, D.R. and Czuba, K. “Answering What-Is Questions by Virtual Annotation”. Proceedings of Human Language Technologies Conference, San Diego CA, March 2001. Prager, J.M., Brown, E.W., Coden, A. and Radev, R. "Question-Answering by Predictive Annotation”. Proceedings of SIGIR 2000, pp. 184-191, Athens, Greece. Radev, D.R., Qi, H., Zheng, Z., Blair-Goldensohn, S., Zhang, Z., Fan, W. & Prager, J.M. “Mining the Web for Answers to Natural Language Questions”, Proceedings of CIKM, Altlanta GA., 2001. 
Radev, D.R., Prager, J.M. and Samn, V. "Ranking Suspected Answers to Natural Language Questions using Predictive Annotation”. Proceedings of ANLP 2000, pp. 150-157, Seattle, WA. Deepak Ravichandran and Eduard Hovy, “Learning Surface Text Patterns for a Question Answering System”. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 41-47. Rosch, E. et al. “Basic Objects in Natural Categories”, Cognitive Psychology 8, pp. 382-439, 1976. Soubbotin, M. “Patterns of Potential Answer Expressions as Clues to the Right Answers” in Proceedings of the 10th Text Retrieval Conference, pp. 293-302, NIST, Gaithersburg, MD, 2002. Soubbotin, M. and Soubbotin, S. “Use of Patterns for Detection of Answer Strings: A Systematic Approach” in Proceedings of the 11th Text Retrieval Conference, pp. 325-331, NIST, Gaithersburg, MD, 2003. References:  References Ellen M. Voorhees and Dawn Tice. 2000. Building a question answering test collection. In 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 200-207, Athens, August. N. Wacholder, Y. Ravin and M. Choi. “Disambiguation of Proper Names in Text”, Proceedings of ANLP’97. Washington, DC, April 1997. Warren, David H.D., & Fernando C.N. Pereira (1982) "An efficient easily adaptable system for interpreting natural language queries," Computational Linguistics, 8:3-4, 110-122. Terry Winograd. 1972. Procedures as a representation for data in a computer program for under-standing natural language. Cognitive Psychology, 3(1).
