Question Answering Techniques and Systems
Mihai Surdeanu
TALP Research Center, Dep. Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya
surdeanu@lsi.upc.edu

What is the Catalan language?

What is the longest ruling dynasty of Japan?
I don't want to learn Boolean logic: (dynasty AND (Japan OR Japanese) AND (NOT tempura))

Overview:
- What is Question Answering? Definition, evaluation, classes of questions
- Generic architectures
- Other relevant approaches and systems

Problem of Question Answering:
What is the nationality of Pope John Paul II?
"… stabilize the country with its help, the Catholic hierarchy stoutly held out for pluralism, in large part at the urging of Polish-born Pope John Paul II. When the Pope emphatically defended the Solidarity trade union during a 1987 tour of the…"
A natural language question, not a keyword query; a short text fragment, not a URL list.

Beyond Document Retrieval:
Document Retrieval: users submit queries corresponding to their information needs, and the system returns a (voluminous) list of full-length documents. It is the responsibility of the users to find the information of interest within the returned documents.
Open-Domain Question Answering (QA): users ask questions in natural language ("What is the highest volcano in Europe?") and the system returns a list of short answers ("… Under Mount Etna, the highest volcano in Europe, perches the fabulous town …"). Often more useful for specific information needs.

Evaluating QA Systems:
The National Institute of Standards and Technology (NIST) organizes the yearly Text Retrieval Conference (TREC), which has had a QA track for the past 7 years, from 1999 to 2005. More recently, the Cross-Language Evaluation Forum (CLEF) has organized a similar evaluation in Europe.
The document set: newswire documents from the LA Times, San Jose Mercury News, Wall Street Journal, NY Times etc., now over 1M documents. Well-formed lexically, syntactically and semantically (the articles were reviewed by professional editors).
The questions: hundreds of new questions every year, about 2,500 in total across all TRECs.
Task: initially extract at most 5 answers, long (250 bytes) and short (50 bytes); now extract only one exact answer. Several other sub-tasks were added later: definition, list, context.
Metrics: Mean Reciprocal Rank (MRR). Each question is assigned the reciprocal rank of its first correct answer: if the first correct answer is at position k, the score is 1/k.
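The MRR metric just described is simple to compute. A minimal sketch, assuming each question comes with a ranked list of answers and a correctness judgment; the function and data layout below are illustrative, not TREC's actual scoring code:

```python
def mean_reciprocal_rank(ranked_answers_per_question, is_correct):
    """MRR: each question contributes 1/k, where k is the rank of its
    first correct answer, and 0 if no returned answer is correct."""
    total = 0.0
    for answers in ranked_answers_per_question:
        for rank, answer in enumerate(answers, start=1):
            if is_correct(answer):
                total += 1.0 / rank
                break
    return total / len(ranked_answers_per_question)

# Toy usage: the single question below is answered correctly at rank 2,
# so the score is 1/2.
print(mean_reciprocal_rank([["Stromboli", "Mount Etna", "Vesuvius"]],
                           lambda a: a == "Mount Etna"))
```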
Classes of QA Systems (1/2):
Capable of processing factual questions: the exact answer exists in text snippets and is extracted through keyword manipulations, maybe some morphological operations.
With simple reasoning mechanisms: the exact answer exists in text snippets, but some inference is required to link the answer to the question; world and domain knowledge (ontologies) are necessary. "How did Socrates die?" → "… was poisoned…"

Classes of QA Systems (2/2):
Capable of answer fusion: the exact answer does not exist in a single text fragment, but is scattered across multiple documents. "How do I assemble a bicycle?", "Who was president of the United States during the recession?"
Interactive systems: user-system dialog interaction, requiring discourse processing, coreference resolution etc. "What ocean is between Europe and the US?" "How wide?"
Capable of analogical reasoning: can answer speculative questions where the answer is not explicit in the documents; systems extract pieces of evidence and use analogical reasoning. "Is the US out of recession?"

Overview:
What is Question Answering? Generic architectures: for factual questions, for definitional questions, for complex, temporal questions. Other relevant approaches and systems.

QA Block Architecture:
[Diagram] A question Q flows through three stages: Question Processing (which produces the question semantics and the keywords), Passage Retrieval (which queries a Document Retrieval back-end and returns passages), and Answer Extraction (which produces the answer A). Question Processing and Answer Extraction are supported by WordNet, a named-entity recognizer (NER) and a parser.

Question Processing:
Understand the expected answer type (passed to Answer Extraction); extract and prioritize the question keywords (passed to Passage Retrieval).

Question Stems and Answer Type Examples:
Identify the semantic category of the expected answers. [Table of question stems paired with answer types.] Other question stems: Who, Which, Name, How hot... Other answer types: Country, Number, Product...

Lexical Terms Examples:
Questions are approximated by sets of unrelated words (lexical terms), similar to bag-of-word IR models. [Table of example questions and their lexical terms.]

Detecting the Expected Answer Type:
In some cases the question stem is sufficient to indicate the answer type (AT): Why → REASON, When → DATE. In many cases the question stem is ambiguous: "What was the name of Titanic's captain?", "What U.S. Government agency registers trademarks?", "What is the capital of Kosovo?" Question reformulations are hard to handle with manually crafted rules: "What tourist attractions are there in Barcelona?", "What are the names of the tourist attractions in Barcelona?", "What do most tourists visit in Barcelona?"

Detecting the Expected Answer Type: SMU's Approach:
Converts the question parse to a graph-like "semantic representation". The node with the highest connectivity is the "question focus word" (QFW): "What was the name of Titanic's captain?", "What U.S. Government agency registers trademarks?", "What is the capital of Kosovo?" All hyponyms of certain question focus words are assigned the same class.

Building the Question Representation:
From the question parse tree, a bottom-up traversal with a set of propagation rules: assign labels to non-skip leaf nodes; propagate the label of the head child node to the parent node; link the head child node to the other children nodes. [Example parse tree for "Why did David Koresh ask the FBI for a word processor", with POS tags WRB VBD NNP NNP VB DT NNP IN DT NN NN and constituents WHADVP, NP, PP, VP, SQ, SBARQ.] The resulting question representation links "David Koresh", "ask", "FBI", "word processor" and the REASON answer type.

AT Detection Algorithm:
Select the question focus word from the question representation: select the word(s) connected to the question; some content-free words are skipped (e.g. "name"); from the previous set, select the word with the highest connectivity in the question representation. Then map the AT word in a previously built AT hierarchy. The AT hierarchy is based on WordNet, with some concepts associated with semantic categories, e.g. "writer" → PERSON. Select the AT(s) from the first hypernym(s) associated with a semantic category.
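A minimal sketch of that last step, the hypernym walk, assuming NLTK's WordNet interface; the MARKED_SYNSETS table is a tiny hypothetical stand-in for the hand-built mapping from WordNet concepts to answer types:

```python
from nltk.corpus import wordnet as wn  # requires the NLTK WordNet data

# Hypothetical, hand-marked synset -> answer type mapping (a small stand-in
# for the real AT hierarchy, which marks many more WordNet concepts).
MARKED_SYNSETS = {
    "person.n.01": "PERSON",
    "location.n.01": "LOCATION",
    "organization.n.01": "ORGANIZATION",
    "time_period.n.01": "DATE",
}

def answer_types(question_focus_word):
    """Walk up the hypernym chains of the QFW and return the answer
    type(s) attached to the first marked synset(s) encountered."""
    found = set()
    frontier = wn.synsets(question_focus_word, pos=wn.NOUN)
    while frontier and not found:
        next_frontier = []
        for synset in frontier:
            if synset.name() in MARKED_SYNSETS:
                found.add(MARKED_SYNSETS[synset.name()])
            else:
                next_frontier.extend(synset.hypernyms())
        frontier = next_frontier
    return found

print(answer_types("captain"))  # reaches person.n.01, so {'PERSON'}
```

Because the walk starts from all noun senses of the QFW, an ambiguous word such as "plant" can return more than one type, which is exactly the weakness discussed below.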
Answer Type Hierarchy:
[Figure: a fragment of the answer type hierarchy, showing WordNet subtrees mapped to the PERSON answer type.]

Evaluation of Answer Type Hierarchy:
Controlled variation of the number of WordNet synsets included in the answer type hierarchy; tested on 800 TREC questions.

  Hierarchy coverage   Precision score (50-byte answers)
  0%                   0.296
  3%                   0.404
  10%                  0.437
  25%                  0.451
  50%                  0.461

The derivation of the answer type is the main source of unrecoverable errors in the QA system.

Discussion (SMU Approach):
Advantages: robust, handles a large variety of paraphrases; can be easily customized: just add AT categories to WordNet synsets.
Disadvantages: the mapping from answer types to WordNet synsets is constructed entirely by hand. Without robust Word Sense Disambiguation (WSD), which synsets should be marked in WordNet? Example: the noun "plant" has 4 WordNet senses; the first two are "industrial plant" and "flora", which obviously point to distinct ATs. What about words that do not appear in WordNet? It also does not handle ambiguity too well: if a QFW maps to more than one answer type, all get the same priority.

Detecting the Expected Answer Type: UIUC's Approach:
Treats the problem as a typical machine learning (ML) classification task: a taxonomy of question classes is defined offline; training data for each question class is annotated; classifiers are trained for each question class using a rich set of features.

Answer Type Taxonomy (1/2):
A two-layered taxonomy: 6 coarse classes (ABBREVIATION, ENTITY, DESCRIPTION, HUMAN, LOCATION, NUMERIC_VALUE) and 50 fine classes (HUMAN: group, individual, title, description; ENTITY: animal, body, color, currency…; LOCATION: city, country, mountain…).

Answer Type Taxonomy (2/2):
[Figure: the full two-layered taxonomy.]

Answer Type Examples:
[Table of example questions labeled with their coarse and fine classes.]

The Ambiguity Problem:
The classification of a specific question can be quite ambiguous. Examples: "What is bipolar disorder?" → definition OR disease; "What do bats eat?" → food OR plant OR animal; "What is the pH scale?" → numeric_value OR definition. Solution: allow the assignment of multiple class labels to a single question!

Classifier Features:
Lexical features: the question words ("When" → date); N-grams of question words ("How_long" → measure).
Syntactic features: part-of-speech tags of all question words (unigrams, N-grams); the sequence of phrases in the question (unigrams, N-grams); head words of question phrases ("captain" → individual), as unigrams and N-grams.
Semantic features: named entities in the question; semantically related words ("away" related to "distance" → measure), from Dekang Lin's proximity-based database (http://www.cs.ualberta.ca/~lindek/) and a small semantic database developed in-house.

Classification Results:
Trained on (only) 5,500 questions, tested on 500 questions, using a perceptron-based ML system. [Chart: accuracy grows as more features are added; the top 5 classes were used.] If only the best class is considered, accuracy is ~88%.
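To make the UIUC-style setup concrete, here is a small sketch of a question classifier as a linear model over word unigrams and bigrams, using scikit-learn. The five training questions are purely illustrative, and the feature set is deliberately impoverished compared to the lexical, syntactic and semantic features listed above:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Perceptron
from sklearn.pipeline import make_pipeline

# Toy training set; the real classifier was trained on ~5,500 annotated questions.
questions = [
    "When did Bob Marley die ?",
    "How long is the Nile ?",
    "Who was the first private citizen to fly in space ?",
    "What is the capital of Kosovo ?",
    "What is bipolar disorder ?",
]
labels = ["NUMERIC:date", "NUMERIC:distance", "HUMAN:individual",
          "LOCATION:city", "DESCRIPTION:definition"]

# Word unigrams and bigrams stand in for the richer feature set above.
classifier = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), lowercase=True),
    Perceptron(max_iter=100),
)
classifier.fit(questions, labels)
print(classifier.predict(["When was Mozart born ?"]))  # likely NUMERIC:date
```

The multiple-label treatment required by the ambiguity problem can be approximated by keeping the top-k classes by decision score instead of a single prediction.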
Discussion (UIUC Approach):
Advantages: performs better than the numbers reported by SMU (informal communication); an elegant framework to handle the ambiguity problem; easy to train, no linguistic experience necessary; not (so) sensitive to WSD ambiguities, because it makes its decisions based on a larger context.
Disadvantages: somewhat harder to customize: ~100 question examples are needed for each new class.

Question Processing:
Understand the expected answer type (passed to Answer Extraction); extract and prioritize the question keywords (passed to Passage Retrieval).

Keyword Selection:
The AT indicates what the question is looking for, but provides insufficient context to locate the answer in a very large document collection. Lexical terms (keywords) from the question, possibly expanded with lexical/semantic variations, provide the required context.

Keyword Selection Algorithm:
- Select all non-stop words in quotations → 10
- Select all NNP words in recognized named entities → 9
- Select all complex nominals with their adjectival modifiers → 8
- Select all other complex nominals → 7
- Select all adjectival modifiers → 6
- Select all other nouns → 5
- Select all verbs → 4
- Select all adverbs → 3
- Select the QFW word (which was skipped in all previous steps) → 2
- Select all other words → 1

Walk-through Example:
Who coined the term 'cyberspace' in his novel 'Neuromancer'? → cyberspace/10, Neuromancer/10, term/7, novel/7, coined/4

Keyword Selection Examples:
What researcher discovered the vaccine against Hepatitis-B? → Hepatitis-B, vaccine, discover, researcher
What is the name of the French oceanographer who owned Calypso? → Calypso, French, own, oceanographer
What U.S. government agency registers trademarks? → U.S., government, trademarks, register, agency
What is the capital of Kosovo? → Kosovo, capital
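A rough sketch of these priorities in code. Without a parser or named-entity recognizer it approximates "named entity" with capitalization and lumps all remaining content words into one priority, so the numbers only loosely follow the heuristics above:

```python
import re

STOP_WORDS = {"the", "a", "an", "of", "in", "is", "was", "what", "who",
              "which", "did", "do", "his", "her", "against", "to"}

def prioritized_keywords(question):
    """Crude approximation of the keyword selection heuristics: quoted
    phrases score highest, capitalized words (a stand-in for named
    entities) next, remaining content words lowest."""
    scores = {}
    # 1. Non-stop words inside quotation marks -> priority 10
    for phrase in re.findall(r"'([^']+)'|\"([^\"]+)\"", question):
        for word in " ".join(phrase).split():
            if word.lower() not in STOP_WORDS:
                scores[word] = 10
    # 2. Capitalized, non-sentence-initial words -> priority 9
    tokens = re.findall(r"[A-Za-z][\w-]*", question)
    for tok in tokens[1:]:
        if tok[0].isupper():
            scores.setdefault(tok, 9)
    # 3. Remaining content words -> priority 5
    for tok in tokens:
        if tok.lower() not in STOP_WORDS:
            scores.setdefault(tok, 5)
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(prioritized_keywords("Who coined the term 'cyberspace' in his novel 'Neuromancer'?"))
```

The quoted phrases and the capitalized name come out on top, which is what Passage Retrieval needs from this step.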
Passage Retrieval:
[The QA block architecture diagram again, now focusing on the Passage Retrieval stage between Question Processing and Answer Extraction.]

Passage Retrieval Architecture:
[Diagram] Keywords are sent to Document Retrieval; Passage Extraction pulls passages from the retrieved documents; a Passage Quality check either accepts the passages or loops back through Keyword Adjustment, which adjusts the query to retrieve more or fewer passages (we may want/need to eliminate some passages); accepted passages go through Passage Scoring and Passage Ordering, producing ranked passages.

Passage Extraction Loop:
Passage Extraction Component: extracts passages that contain all selected keywords. Both the passage size and the start position are dynamic: if the passage size or offset were static, you might end the passage in the middle of the answer, or of the answer context!
Passage quality and keyword adjustment: in the first iteration, use the first 6 keyword selection heuristics. If the number of passages is lower than a threshold, the query is too strict → drop a keyword. If the number of passages is higher than a threshold, the query is too relaxed → add a keyword.

Passage Scoring (1/2):
Passages are scored based on keyword windows. For example, if a question has the set of keywords {k1, k2, k3, k4}, and in a passage k1 and k2 are matched twice, k3 is matched once, and k4 is not matched, four windows are built, one for each combination of a k1 match with a k2 match.

Passage Scoring (2/2):
Passage ordering is performed using a radix sort that involves three scores: largest SameWordSequenceScore, smallest DistanceScore, smallest MissingKeywordScore.
SameWordSequenceScore: the number of words from the question that are recognized in the same sequence in the window. Intuition: passages with the same keyword order as the question are better.
DistanceScore: the number of words that separate the most distant keywords in the window. Intuition: passages with denser keywords are better.
MissingKeywordScore: the number of unmatched keywords in the window. Intuition: passages with fewer missing keywords are better.
Essentially this is an optimization step: drop passages below a certain threshold → speed improvement! If the top 1,000 passages are maintained, more than 80% of the questions have at least one correct passage.

Answer Extraction:
[The QA block architecture diagram again, now focusing on the Answer Extraction stage.]

Ranking Candidate Answers:
Q066: Name the first private citizen to fly in space.
Answer type: Person. Text passage: "Among them was Christa McAuliffe, the first private citizen to fly in space. Karen Allen, best known for her starring role in 'Raiders of the Lost Ark', plays McAuliffe. Brian Kerwin is featured as shuttle pilot Mike Smith..."
Best candidate answer: Christa McAuliffe.

Features for Answer Ranking:
- relNMW – number of question terms matched in the answer passage
- relSP – number of question terms matched in the same phrase as the candidate answer
- relSS – number of question terms matched in the same sentence as the candidate answer
- relFP – flag set to 1 if the candidate answer is followed by a punctuation sign
- relOCTW – number of question terms matched, separated from the candidate answer by at most three words and one comma
- relSWS – number of terms occurring in the same order in the answer passage as in the question
- relDTW – average distance from the candidate answer to the question term matches
These are robust heuristics that work on unrestricted text!

Answer Ranking based on Machine Learning:
A relative relevance score is computed for each pair of candidates (answer windows):
relPAIR = wSWS × relSWS + wFP × relFP + wOCTW × relOCTW + wSP × relSP + wSS × relSS + wNMW × relNMW + wDTW × relDTW + threshold
If relPAIR is positive, the first candidate of the pair is more relevant. A perceptron model is used to learn the weights (published by Marius Pasca, SIGIR 2001). Scores are in the 50% MRR range for short answers (50 bytes) and in the 60% MRR range for long answers (250 bytes). MRR: the reciprocal rank of the first correct answer, e.g. 1/3 if the first correct answer is in the third position.
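A sketch of that pairwise comparison, under one plausible reading of the formula in which each rel* feature is the difference between the two candidates' raw values. The weights below are invented for illustration; the real ones were learned with a perceptron:

```python
from functools import cmp_to_key

# Illustrative weights only; the real values were learned from data.
WEIGHTS = {"relSWS": 1.5, "relFP": 0.5, "relOCTW": 1.0, "relSP": 2.0,
           "relSS": 1.0, "relNMW": 0.8, "relDTW": -0.3}
THRESHOLD = 0.0

def rel_pair(cand_a, cand_b):
    """relPAIR for two candidates given as dicts of raw feature values;
    a positive score means the first candidate is more relevant."""
    return THRESHOLD + sum(w * (cand_a[f] - cand_b[f])
                           for f, w in WEIGHTS.items())

def rank(candidates):
    """Order candidate answers so that pairwise-preferred ones come first."""
    return sorted(candidates,
                  key=cmp_to_key(lambda a, b: -1 if rel_pair(a, b) > 0 else 1))
```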
Evaluation on the Web:
Tested on 350 questions from TREC (Q250–Q600), extracting 250-byte answers. [Chart of results.]

Overview:
What is Question Answering? Generic architectures: for factual questions, for definitional questions, for complex, temporal questions. Other relevant approaches and systems.

System Extension: Definition Questions:
Definition questions ask about the definition or description of a concept: "Who is John Galt?", "What is anorexia nervosa?" Many "information nuggets" are acceptable answers. Who is George W. Bush? "… George W. Bush, the 43rd President of the United States…", "George W. Bush defeated Democratic incumbent Ann Richards to become the 46th Governor of the State of Texas…" Scoring: any information nugget is acceptable; a precision score is computed over all information nuggets.

Answer Detection with Pattern Matching:
[Table of lexico-syntactic answer patterns for definition questions.]

Answer Detection with Concept Expansion:
Problem: lexico-syntactic patterns have a tendency to over-match, so additional semantic constraints are needed. Solution: favor patterns where the AP is semantically related to the phrase to define, using WordNet hypernyms (more general concepts).

Evaluation on Definition Questions:
Determine the impact of answer type detection with pattern matching and concept expansion. Tested on the definition questions from TREC-9 and TREC-10 (approx. 200 questions), extracting 50-byte answers. Results: precision score 0.56; questions with a correct answer among the top 5 returned answers: 0.67.

Overview:
What is Question Answering? Generic architectures: for factual questions, for definitional questions, for complex, temporal questions. Other relevant approaches and systems.

Simple and Complex Temporal Questions:
The previous factual QA system can answer simple temporal questions, where the AT is a date or which include simple temporal expressions: "When did Bob Marley die?", "Who won the U.S. Open in 1999?" It cannot answer more complex questions that require the detection of temporal properties or event ordering: "Who was spokesman of the Soviet Embassy in Baghdad during the invasion of Kuwait?", "Is Bill Clinton currently the President of the United States?"

Temporal Question Taxonomy (1/2):
Simple temporal questions. Type 1: single-event temporal questions without temporal expressions (TE): "When did Jordan close the port of Aqaba to Kuwait?" Type 2: single-event temporal questions with temporal expressions: "Who won the 1998 New Hampshire republican primary?"

Temporal Question Taxonomy (2/2):
Complex temporal questions. Type 3: multiple-event temporal questions with a temporal expression: "What did George Bush do after the U.N. Security Council ordered a global embargo on trade with Iraq in August 1990?" (temporal signal: "after"; temporal constraint: "between 1/8/1990 and 31/8/1990"). Type 4: multiple-event temporal questions without a temporal expression: "What happened to oil prices after the Iraqi annexation of Kuwait?" (temporal signal: "after").
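As a rough illustration of how these four types can be told apart from surface cues alone, here is a sketch that keys on temporal signal words and a crude temporal-expression pattern; the real system used a decision tree over richer features, described below:

```python
import re

# Surface cues only; the real system used a decision tree.
SIGNALS = re.compile(r"\b(after|before|during|while)\b", re.I)
# Crude temporal-expression detector: a four-digit year, optionally with a month.
TIMEX = re.compile(r"\b(january|february|march|april|may|june|july|august|"
                   r"september|october|november|december)?\s*(1[89]\d{2}|20\d{2})\b",
                   re.I)

def temporal_question_type(question):
    """Map a question to Types 1-4 of the taxonomy above."""
    multi_event = SIGNALS.search(question) is not None
    has_timex = TIMEX.search(question) is not None
    if multi_event:
        return "Type 3 (multi-event, with TE)" if has_timex \
            else "Type 4 (multi-event, no TE)"
    return "Type 2 (single event, with TE)" if has_timex \
        else "Type 1 (single event, no TE)"

print(temporal_question_type(
    "What did George Bush do after the U.N. Security Council ordered "
    "a global embargo on trade with Iraq in August 1990?"))  # Type 3
```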
Approach Overview:
Decompose the question into simpler factual questions: "Who was spokesman of the Soviet Embassy in Baghdad during the invasion of Kuwait?" becomes "Who was spokesman of the Soviet Embassy in Baghdad?" and "When did the invasion of Kuwait occur?" Look for all possible answers to the first question and for all possible answers to the second question, then give as the final answer one of the answers to the first question whose associated date is consistent with the answer to the second question.

Architecture of the Temporal QA System:
[Diagram of the temporal QA architecture.]

Decision Tree for Type Identification:
Temporal signals: after, when, before, during, while, for…
[Decision tree assigning questions to Types 1-4.]

Algorithm for Question Splitting:
[Algorithm listing.]

Question Splitting Examples:
"Where did Bill Clinton study before going to Oxford University?" Temporal signal: "before". Q1: "Where did Bill Clinton study?" Q2: "When did Bill Clinton go to Oxford University?"
"What did George Bush do after the U.N. Security Council ordered a global embargo on trade with Iraq in August 1990?" Temporal signal: "after"; temporal expression: "in August 1990". Q1: "What did George Bush do?" Q2: "When did the U.N. Security Council order a global embargo on trade with Iraq in August 1990?"

Question Decomposition Evaluation:
[Table of question decomposition results.]

Overview:
What is Question Answering? Generic architectures. Other relevant approaches and systems: LCC's PowerAnswer + COGEX, IBM's PIQUANT, CMU's Javelin, ISI's TextMap, BBN's AQUA.

PowerAnswer + COGEX (1/2):
Automated reasoning for QA: prove A → Q using a logic prover. This facilitates both answer validation and answer extraction. Both the question and the answer(s) are transformed into logic forms. Example: "Heavy selling of Standard & Poor's 500-stock index futures in Chicago relentlessly beat stocks downwards." becomes
Heavy_JJ(x1) & selling_NN(x1) & of_IN(x1,x6) & Standard_NN(x2) & &_CC(x13,x2,x3) & Poor(x3) & 's_POS(x6,x13) & 500-stock_JJ(x6) & index_NN(x4) & futures(x5) & nn_NNC(x6,x4,x5) & in_IN(x1,x8) & Chicago_NNP(x8) & relentlessly_RB(e12) & beat_VB(e12,x1,x9) & stocks_NN(x9) & downward_RB(e12)

PowerAnswer + COGEX (2/2):
World knowledge comes from: WordNet glosses converted to logic forms in the eXtended WordNet (XWN) project (http://www.utdallas.edu/~moldovan); lexical chains (game:n#3 → HYPERNYM → recreation:n#1 → HYPONYM → sport:n#1; Argentine:a#1 → GLOSS → Argentina:n#1); NLP axioms to handle complex NPs, coordinations, appositions, equivalence classes for prepositions etc. ("… Barcelona, the capital of Catalonia, …" yields Capital AND Catalonia → Barcelona); and a named-entity recognizer (John Galt → HUMAN). A relaxation mechanism is used to iteratively uncouple predicates and remove terms from the logic forms; the proofs are penalized based on the amount of relaxation involved.

PowerAnswer: Discussion:
Advantages: an elegant, formal mechanism for QA; it proves whether an answer is correct or not, rather than offering an answer ranking.
Disadvantages: it requires many NLP tools (complete syntactic analysis, WSD), so it is prone to errors and slow, and it cannot handle non-monotonic language constructs. Monotonicity of QA: if A answers Q, then adding more words to A does not change the fact that A is still a correct answer. This is not true for negations, non-factive verbs (claim, think), numeric mismatches, etc. Where is Barcelona located? "Barcelona is not located in France." "I think Barcelona is in France."
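The monotonicity examples above suggest a cheap safeguard that even a shallow system can apply: discard keyword-matching answer sentences that contain a negation or a non-factive verb. The filter below is a hypothetical illustration of the problem, not part of PowerAnswer:

```python
# Illustrative cue lists; a real system would need a proper scope analysis.
NEGATIONS = {"not", "never", "no"}
NON_FACTIVE = {"think", "thinks", "thought", "claim", "claims", "claimed",
               "believe", "believes", "believed"}

def looks_monotonic(sentence):
    """Return False if the sentence contains cues that can reverse or
    weaken the truth of an otherwise keyword-matching answer."""
    tokens = [t.lower().strip(".,!?") for t in sentence.split()]
    return not any(t in NEGATIONS or t in NON_FACTIVE or t.endswith("n't")
                   for t in tokens)

for s in ["Barcelona is located in Spain.",
          "Barcelona is not located in France.",
          "I think Barcelona is in France."]:
    print(s, "->", "keep" if looks_monotonic(s) else "reject")
```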
IBM's Piquant:
Question processing is conceptually similar to SMU's, but a series of different strategies ("agents") is available for answer extraction; for each question type, multiple agents might run in parallel. A reasoning engine and a general-purpose ontology from Cyc are used as a sanity checker. Answer resolution: the remaining answers are normalized and a voting strategy is used to select the "correct" (meaning most redundant) answer.

Piquant QA Agents:
- Predictive annotation agent: "predictive annotation" is the technique of indexing named entities and other NL constructs along with the lexical terms (Lemur has built-in support for this now). A general-purpose agent, used for almost all question types.
- Statistical query agent: derived from a probabilistic IR model, also developed at IBM. Also general-purpose.
- Description query agent: generic descriptions: appositions, parenthetical expressions. Applied mostly to definition questions.
- Structured knowledge agent: answers from WordNet/Cyc. Applied whenever possible.
- Pattern-based agent: looks for specific syntactic patterns based on the question form. Applied when the answer is expected in a well-structured form.
- Dossier agent: for "Who is X?" questions. A dynamic set of factual questions used to learn "information nuggets" about persons.

Pattern-based Agent:
Motivation: some questions (with or without an AT) indicate that the answer might be in a structured form. "What does Knight Ridder publish?" → transitive verb, missing object → "Knight Ridder publishes X." Patterns are generated from a static pattern repository (e.g. recognition of birth and death dates) or dynamically from the question structure. Matching of the expected answer pattern with the actual answer text is not done at the word level, but at a higher linguistic level based on full parse trees. Informal communication: about 5% of TREC questions are answered by this agent.

Dossier Agent:
Addresses "Who is X?" questions. It initially generates a series of generic questions: When was X born? What was X's profession? Future iterations are dynamically decided based on the previous answers: if X's profession is "writer", the next question is "What did X write?" A static ontology of biographical questions is used.

CyC Sanity Checker:
A post-processing component that rejects insane answers: for "How much does a grey wolf weigh?", the answer "300 tons" is rejected because a grey wolf IS-A wolf and the weight of a wolf is known in Cyc. Cyc returns SANE, INSANE, or DON'T KNOW, and answer confidence is boosted when the answer is SANE. Typically called for numerical answer types: What is the population of Maryland? How much does a grey wolf weigh? How high is Mt. Hood?

Answer Resolution:
Called when multiple agents are applied to the same question. Distribution of agents: the predictive-annotation and the statistical agents are by far the most common. Each agent provides a canonical answer (e.g. a normalized named entity) and a confidence score. The final confidence for each candidate answer is computed using an ML model with SVMs, which uses the answers and confidences provided by the individual agents as input.

CMU's Javelin:
The architecture combines SMU's and IBM's approaches. Question processing is close to SMU's approach. The passage retrieval loop is conceptually similar to SMU's, but with an elegant implementation. Multiple answer strategies, similar to IBM's system; all of them are based on ML models (k-nearest neighbours, decision trees) that use shallow-text features (close to SMU's). Answer voting, similar to IBM's, is used to exploit answer redundancy.
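A minimal sketch of this kind of redundancy-based answer resolution: normalize each agent's answer string and add up the confidences. The agent names and numbers are made up, and the real Piquant combination used a trained SVM rather than a plain sum:

```python
from collections import defaultdict

def resolve_answers(agent_outputs):
    """Combine (agent, answer, confidence) triples by summing confidences
    over normalized answer strings and return the best answer."""
    votes = defaultdict(float)
    for agent, answer, confidence in agent_outputs:
        votes[answer.strip().lower()] += confidence
    return max(votes.items(), key=lambda kv: kv[1])

outputs = [
    ("predictive-annotation", "Mount Etna", 0.7),
    ("statistical", "Mount Etna", 0.6),
    ("pattern-based", "Vesuvius", 0.8),
]
print(resolve_answers(outputs))  # 'mount etna' wins with combined confidence ~1.3
```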
Javelin's Retrieval Strategist:
Implements passage retrieval, including the passage retrieval loop. It uses the Inquery IR system (probably Lemur by now). The retrieval loop initially requires all keywords in close proximity of each other (stricter than SMU); subsequent iterations relax the following query terms: proximity for all question keywords (20, 100, 250, AND); phrase proximity for phrase operators (less than 3 words, or PHRASE); phrase proximity for named entities (less than 3 words, or PHRASE); inclusion/exclusion of the AT word; discarding other keywords. Accuracy for TREC-11 queries, measured as how many questions had at least one correct document in the top N documents: top 30 docs: 80%; top 60 docs: 85%; top 120 docs: 86%.

ISI's TextMap: Pattern-Based QA:
Examples: "Who invented the cotton gin?" → "<who> invented the cotton gin", "<who>'s invention of the cotton gin", "<who> received a patent for the cotton gin". "How did Mahatma Gandhi die?" → "Mahatma Gandhi died <how>", "Mahatma Gandhi drowned", "<who> assassinated Mahatma Gandhi". Patterns are generated from the question form (similar to IBM), learned using a pattern discovery mechanism, or added manually to a pattern repository. The pattern discovery mechanism performs a series of generalizations from annotated examples: "Babe Ruth was born in Baltimore, on February 6, 1895." → "PERSON was born *g* on DATE".

TextMap: QA as Machine Translation:
In machine translation, one collects translation pairs (s, d) and learns a model of how to transform the source s into the destination d. QA is redefined in a similar way: collect question-answer pairs (q, a) and learn a model that computes the probability that a question is generated from the given answer, p(q | parsetree(a)). The correct answer maximizes this probability. Only the subsets of the answer parse trees where the answer lies are used as training (not the whole sentence). An off-the-shelf machine translation package (Giza) is used to train the model.

TextMap: Exploiting the Data Redundancy:
Additional knowledge resources are used whenever applicable: WordNet glosses ("What is a meerkat?"), www.acronymfinder.com ("What is ARDA?"), etc. The "known" answers are then simply searched in the document collection together with the question keywords. Google is used for answer redundancy: TREC and the Web (through Google) are searched in parallel, and the final answer is selected using a maximum entropy ML model. IBM introduced redundancy across QA agents; ISI uses data redundancy.

BBN's AQUA:
The factual system converts both the question and the answer to a semantic form (close to SMU's), and machine learning is used to measure the similarity of the two representations. It was ranked best at the TREC definition pilot organized before TREC-12. The definition system is conceptually close to SMU's, but it had pronominal and nominal coreference resolution, used a (probably) better parser (Charniak), and post-ranked the candidate answers using a tf * idf model.
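That last post-ranking step is easy to approximate. A minimal sketch using scikit-learn's tf*idf vectorizer and cosine similarity, a stand-in rather than BBN's actual implementation:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_candidates(question, candidates):
    """Order candidate answer passages by tf*idf cosine similarity to the
    question, highest first."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([question] + candidates)
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    return sorted(zip(candidates, scores), key=lambda cs: -cs[1])

print(rank_candidates(
    "What is the highest volcano in Europe?",
    ["Under Mount Etna, the highest volcano in Europe, perches the fabulous town.",
     "Barcelona is the capital of Catalonia."]))
```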
References (1/2):
Marius Paşca. High-Performance, Open-Domain Question Answering from Large Text Collections. Ph.D. Thesis, Computer Science and Engineering Department, Southern Methodist University, Dallas, Texas. Defended September 2001.
Marius Paşca. Open-Domain Question Answering from Large Text Collections. Center for the Study of Language and Information (CSLI Publications, series: Studies in Computational Linguistics), Stanford, California. Distributed by the University of Chicago Press. ISBN (paperback): 1575864282, ISBN (cloth): 1575864274. 2003.
Dan Moldovan, Sanda Harabagiu, Marius Pasca, Rada Mihalcea, Richard Goodrum, Roxana Girju, and Vasile Rus. LASSO: A Tool for Surfing the Answer Net. Text Retrieval Conference (TREC-8), 1999.

References (2/2):
E. Nyberg, T. Mitamura, J. Carbonell, J. Callan, K. Collins-Thompson, K. Czuba, M. Duggan, L. Hiyakumoto, N. Hu, Y. Huang, J. Ko, L. V. Lita, S. Murtagh, V. Pedro, D. Svoboda. The JAVELIN Question Answering System at TREC 2002. Text Retrieval Conference, 2002.
Xin Li and Dan Roth. Learning Question Classifiers: The Role of Semantic Information. Natural Language Engineering, 2004.
E. Saquete, P. Martinez-Barco, R. Munoz, J. L. Vicedo. Splitting Complex Temporal Questions for Question Answering Systems. ACL 2004.

End:
Thank you! (Gràcies!)
