shen 1

Information about shen 1
Entertainment

Published on October 12, 2007

Author: Mahugani

Source: authorstream.com

A Maximum Entropy-based Model for Answer Extraction
  Dan Shen
  IGK, Saarland University
  Supervisors: Prof. Dietrich Klakow, Dr. ir. Geert-Jan M. Kruijff

Part I -- Introduction
  Answer Extraction Module in QA
  Statistical Method for Answer Extraction
  Motivation
  Framework

Answer Extraction Module in QA
  Open-domain factoid Question Answering
  Basic modules
    Information Retrieval Module → a set of relevant sentences / paragraphs
    Answer Extraction (AE) Module → the appropriate answer phrase
  Q: What is the capital of Japan ?
  A: Tokyo
  Q: How far is it from Earth to Mars ?
  A: 249 million miles

Techniques and Resources for AE
  → How to incorporate them?
    Pipeline structure
    Mathematical framework

Motivation – Use Statistical Methods ?
  Flexibility
    Integrating various techniques / resources
    Easy to extend to cover more of them in the future
  Effectiveness

Research Issues
  Answer Candidate Selection
    Which constituent is regarded as an answer candidate (AC)?
  Methods
    classification / ranking / …
  Features

Part II – ME-based model
  Method
  Features
  Experiments and Results

Maximum Entropy Formulation I
  Given a set of answer candidates
  Model the probability
  Define Feature Functions
  Decision Rule

Maximum Entropy Formulation II
  Given a set of answer candidates
  Model the probability
  Define Feature Functions
  Decision Rule

Some Considerations
  Model I: judge whether each candidate is a correct answer
    √ Can find more than one correct answer in a sentence
    ? Is the probability comparable?
    × Suffers from the unbalanced data set (1 Pos / >20 Neg)
  Model II: find the best answer among the candidates
    × Finds only one correct answer per sentence
    √ Directly makes the probabilities of the candidates comparable
  Experiment
    Model II outperforms Model I by about 5%

Part II – ME-based model
  Method
  Features
  Experiments and Results

Question Analysis
  Q: What US biochemists won the Nobel Prize in medicine in 1992 ?
    Question Word -- what
    Target Word -- biochemist
    Subject Words -- Nobel Prize / medicine / 1992
    Verb -- win
  Q: What is the name of the highest mountain in Africa ?
    Question Word -- what
    Target Word -- mountain
    Subject Words -- highest / Africa
    Verb -- be
  PERSON / LOCATION

Answer Candidate Selection
  Preprocessing
    Named Entity Recognition
    Parsing [Collins Parser]
    Conversion to dependency tree
  Answer Candidate Selection
    Base noun phrases
    Named entities
    Leaf nodes
  Answer Candidate Coverage
    11876 / 14039 = 84.6 %
    Some sentences are missed → consider all of the nodes?

Features – Syntactic / POS Tag Features
  Observation
    For who / where questions, answers = Proper Noun
    For how / when questions, answers = CD
  Question Word × Syntactic tag / POS tag
    QWord = “how” & SynTag = “CD”
    QWord = “who” & SynTag = “NNP”
    QWord = “when” & SynTag = “NNP”
    QWord = “when” & SynTag = “CD”
    …

Features – Surface Word Features
  Word formations
    Length / Capitalized / Digits, …
  Question Word × Word formations
    QWord = “who” & word is capitalized
    QWord = “who” & word length < 3
  Word co-occurrence between Q and A
    Observation -- Answers are not a subsequence of the question

Features – Named Entity Features
  Question Type × NE type
    QType = Person & NE type = Person
    QType = Date & NE type = Date
    QType = how much & NE type = Money
    …
  Useful for who, where, when … questions
  But for What / Which / How questions?
    Many expected answer types do not belong to a defined NE type
    Q1: What language is most commonly used in Bombay ?
    Q2: What city is …
    Q3: Which movie win ….

Features – TWord Relation for WHAT I
  TWord is a hypernym of the answer
  TWord is the head of the answer
  Q: What is the name of the airport in Dallas Ft. Worth ?
  A: Wednesday morning , the low temperature at the Dallas-Fort Worth International Airport was 81 degrees .
  Q: What city is Disneyland in ?
  A: Not bad for a struggling actor who was working at Tokyo Disneyland just a few years ago .

Features – TWord Relation for WHAT II
  TWord is the appositive of the answer
  Feature Function
    QWord = what & TWord is hypernym of answer candidate
    …
  Q: What book did Rachel Carson write in 1962 ?
  A1: In her 1962 book Silent Spring , Rachel Carson , a marine biologist , chronicled DDT 's poisonous effects , ….
  A2: In 1962 , former U.S. Fish and Wildlife Service biologist Rachel Carson shocked the nation with her landmark book , Silent Spring .

Features – TWord Relation for HOW
  How many / much + NN …
  How long / far / tall / fast …
    How long … → year / day / month / …
    How tall … → feet / inch / mile / …
    How fast … → per day / per hour / …
  Use some trigger word features
  Q: How many time zones are there in the world ?
  A: The world is divided into 24 time zones .

Features – Subject Word Relations I
  Q: Who invented the paper clip ?
  S1: The paper clip , weighing a desk-crushing 1320 pounds , is a faithful copy of Norwegian Johan Vaaler 's 1899 invention , said …
  S2: “Like the guy who invented the safety pin , or the guy who invented the paper clip” , David says . ×

Features – Subject Word Relations II
  Match subject words in the answer sentence
    Minimal Edit Distance
  Dependency Relationship Matching
    Observation -- answers are close to SWord in the dependency tree → answer and SWord have some relation
    Answer candidate is a subject word
    Answer candidate is the parent / child / brother of SWord
    The path from the answer candidate to SWord
  Q: What is the name of the airport in Dallas Ft. Worth ?
  A: Wednesday morning , the low temperature at the Dallas-Fort Worth International Airport was 81 degrees

Part II – ME-based model
  Method
  Features
  Experiments and Results

Experiment Settings
  Training Data
    TREC 1999, TREC 2000, TREC 2002
    Total Number of Questions: 1108
    Total Number of Sentences: 11331
  Test Data
    TREC 2003
    Total Number of Questions: 362 (NIL questions removed)
    Total Number of Sentences: 2708

Question Word Distribution

Overall Performance
  MRR – Mean Reciprocal Rank
  Return five answers for each question

Contribution of Different Features

Features – Syntactic / POS Tag Features

Features – + Surface Word Features

Features – + Named Entity Features

Features – + TWord Relations for WHAT

Features – + TWord Relations for HOW

Features – + Subject Word Relations

Error Analysis – I
  Target Word Concept Unresolved
    Q: What is the traditional dish served at Wimbledon?
    √A: And she said she wasn't wild about Wimbledon 's famed strawberries and cream .
    ×A: And she said she wasn't wild about Wimbledon 's famed strawberries and cream .
  Choosing the Wrong Entity
    Q: What actress has received the most Oscar nominations?
    √A: Oscar perennial Meryl Streep is up for best actress for the film , tying Katharine Hepburn for most acting nominations with 12 .
    ×A: Oscar perennial Meryl Streep is up for best actress for the film , tying Katharine Hepburn for most acting nominations with 12 .

Error Analysis – II
  Answer Candidate Granularity
    Q: What city is Disneyland in?
    √A: Not bad for a struggling actor who was working at Tokyo Disneyland just a few years ago .
    ×A: Not bad for a struggling actor who was working at Tokyo Disneyland just a few years ago .
  Repeated Target Word in Answer
    Q: How many grams in an ounce?
    √A: NOTE : 30 grams is about 1 ounce .
    ×A: NOTE : 30 grams is about 1 ounce .
  Misc.

Future Work
  Extract answers from the Web
  Evaluate on other data sets
    Knowledge Master Corpus
  How to deal with NIL questions?
  Incorporate more linguistically motivated features

The End
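
The slides describe Model II (one probability distribution over the answer candidates of a sentence, built from question-word × candidate-property feature functions) and evaluate with MRR over the top five answers, but the formulas themselves did not survive in the transcript. The following is only a minimal sketch under stated assumptions: it uses the standard log-linear (softmax) form of a maximum entropy model, and the feature names, dictionary keys, weights, and example candidates are all hypothetical illustrations, not the author's actual feature set or trained parameters.

```python
import math

# Hypothetical conjunction features in the spirit of the slides: each one
# fires (1.0) only when a question property and a candidate property hold
# together, and is 0.0 otherwise.
def f_who_nnp(q, c):
    return 1.0 if q["qword"] == "who" and c["tag"] == "NNP" else 0.0

def f_how_cd(q, c):
    return 1.0 if q["qword"] == "how" and c["tag"] == "CD" else 0.0

def f_qtype_matches_ne(q, c):
    return 1.0 if c["ne"] is not None and q["qtype"] == c["ne"] else 0.0

FEATURES = [f_who_nnp, f_how_cd, f_qtype_matches_ne]
WEIGHTS = [1.2, 1.5, 2.0]  # made-up values; real weights would be learned

def score(q, c):
    """Weighted sum of the feature functions for one candidate."""
    return sum(w * f(q, c) for w, f in zip(WEIGHTS, FEATURES))

def rank_candidates(q, candidates):
    """Model II style: normalise over all candidates of one sentence, so the
    probabilities are directly comparable, then sort best-first."""
    scores = [score(q, c) for c in candidates]
    top = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    return sorted(zip(candidates, probs), key=lambda cp: cp[1], reverse=True)

def mean_reciprocal_rank(ranked_answers, gold, k=5):
    """MRR over questions: 1/rank of the first correct answer among the
    top k returned answers, or 0 if it is not among them."""
    total = 0.0
    for answers, correct in zip(ranked_answers, gold):
        for rank, answer in enumerate(answers[:k], start=1):
            if answer == correct:
                total += 1.0 / rank
                break
    return total / len(gold)

# Toy usage with hand-written candidates for "Who invented the paper clip ?"
question = {"qword": "who", "qtype": "PERSON"}
candidates = [
    {"text": "Johan Vaaler", "tag": "NNP", "ne": "PERSON"},
    {"text": "1320 pounds", "tag": "CD", "ne": None},
    {"text": "paper clip", "tag": "NN", "ne": None},
]
ranked = rank_candidates(question, candidates)
best = ranked[0][0]["text"]  # decision rule: take the most probable candidate
print(best)
```

Normalising within a sentence is what distinguishes this from the Model I setup on the slides, where each candidate would instead receive an independent correct / incorrect probability.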
