chalmers


Published on September 14, 2007

Author: Alien

Source: authorstream.com

Detecting Coreference: Processing discourse reference
Christer Johansson, UiB

Those little words:
Typical anaphora
  Pronouns: The cat bit the dog because it was angry.
  Definite nouns: I saw a cat and a dog. The cat was chasing the dog.

Those little words:
Pronouns
  We often think of pronouns as some kind of placeholder, or a variable.
  Pronouns have some internal structure: gender, number, case, ...

Those little words:
Nouns
  Do not confuse words with what they refer to.
  'Pigs can fly.' is a grammatical sentence even if it is not true. It is also meaningless if we do not know what 'pig' is, or the relations between all three words.
  Words like nouns are in a sense also variables, although they are more restricted. Compare 'cat' with 'it'.

More Anaphora:
Less typical
  Predication: John drives a taxi, and Joe studies math. Which one would you like to meet, the taxi driver or the math student?
  Verb anaphora: Ann sings in the shower. The hollering lasts half an hour.

Coreference:
'Coreference occurs when the same person, place, event, or concept is referenced more than once in a single document.' (Amit Bagga)

Extension of Coreference:
Cross-document coreference
  '... occurs when the same person, place, event, or concept is referenced more than once in multiple sources' (Amit Bagga)
  An essential Information Retrieval problem: are two documents about the same things?

Extension of Coreference: Images:
Amit Bagga: Significant TV broadcasts are repeated across and within stations. Combining text and image recognition can aid detection of coreferent events.
[Images: CBS 2829, CBS 3873, NBC 3885, NBC 5061]

Applications:

Information Extraction and Retrieval:
Q/A systems
  Q: Who was king of Norway in 1985?
  A: #He was king of Norway in 1985.
Reference may find good keywords.
  Themes are often referred to.
  Reference is more than word form.

A simplistic example:
The lion is the king[+1] of the jungle. She[+2] hunts mostly at night. The females[+3] live in groups. The male[+4] is much larger, but _[+5] lives alone.
  Word form: 'lion' is 1 of 26 words.
  Reference: 'lion' is 6 of 26 words.
The significance of 'lion' increases.

Machine Translation:
Det satt en katt-i på bordet-j.
  Heldigvis sto det-j stille. (det-j: it = the table)
  Heldigvis sto den-i stille. (den-i: it = the cat)
There was a cat on the table. Fortunately, it was standing still.
Without coreference, an unambiguous sentence becomes ambiguous.
Important when translating between languages that mark case, gender or aspect.

Machine Translation:
The monkey ate the banana because ...
  1) it was hungry: hungry(it=monkey)
  2) it was ripe: ripe(it=banana)
  3) it was tea time: eat(Agent, Food, When=tea time)

Prosody: Text-to-Speech systems:
Given information is seldom stressed (Horne & Johansson 1991).
  I will never sell my dog. I LOVE the old mutt.
If 'old mutt' and 'my dog' are coreferent, stress is more likely to move to some other (new) information.

Disambiguation:
The lion roams the savannah.
If there is no antecedent, assume the definite NP refers more generally (to the species).
Cognitive Givenness Hierarchy: a referent must be uniquely identifiable.
  There is only one species of lion.
  There are many individual lions.

Disambiguation:
Hunden knäckte benet med sina käkar.
  The dog broke the bone with his jaws.
  The dog broke the leg with his jaws.
  ? The dog broke his leg with his jaws.
If 'benet' can be identified in a reference chain that identifies it, we are more likely to get the correct translation.

Factors:

Gender:
Grammatical gender
  Det sto en katt-x på bordet-y. Den-x/det-y sto heldigvis stille.
  There was a cat on the table. Luckily, it kept still.
Natural gender
  Den nye eleven-x likte sykkelen-y. Han-x/hun-x/den-y var rask.
  The new student liked the bike. He/She/It was fast.

Function of the antecedent: 'centering':
Kari-S var sent ute, så hun-S ringte søsteren-O sin-S. Hun-S skrek inn i røret.
Kari was late so she called her sister. She yelled into the receiver.
'She' most likely refers to Kari, as that choice keeps the focus on her.

Determined Noun Phrase:
A cat and a dog were fighting outside. The dog howled like a wolf.
  Identify stems in Scandinavian: hund / hunden.
Look at that dog-i. I wouldn't like to meet that beast-i alone.
  Compatible units through semantic networks / world knowledge (ontologies).

Negation:
Ann didn't see any woman. She was next door.
Ann saw no woman. She was in another room.
Ann didn't talk to her. She was upset.
  1) She = Ann  2) She = her
Ann talked to her. She was upset.
  1) She = her  2) She = Ann
Ann talked to her, because she was upset. (?)
Ann talked to her. She became upset.
  1) She = her  ? She = Ann
Cause -> Effect

Explanation:
The students protested, while the police observed. Then they attacked.
The students protested, while the police observed. Then they began to throw stones.
What do students do? And what do police do?
The answer depends heavily on background knowledge.
We might extract some background knowledge from large collections of text, by observing statistical relations between subject and verb, and between verb and object.

Heavy NP:
A heavy NP is more likely to be referenced. Heavy NPs have more modifications.
  The small man with the black hat sat in the corner chatting with a clerk. He seemed relaxed.
Embedded NPs are less likely to be referred to than top-level NPs.
  The clerk sat in the corner chatting with a small man with a black hat. He seemed relaxed.
Interacts with other factors.

Semantics:
Through part-whole relations
  It was a beautiful car. He sat behind the wheel.
Through subordinate - superordinate relations
  He wants a dachshund, but I don't know if he can take care of a dog.
Through verb anaphora
  They captured Saddam last Sunday. The event was undramatic.

More:
Co-ordinated NPs
  The cat and dog were fighting. They got hurt.
? I saw a cat with one eye last night. It was horrible.
  The cat (?)  The eye (?)  Last night (?)  The sight / situation (?)

Challenges:
Noisy data underlying decisions
  Word class tags: 95 - 98 % correct
  Functional roles: maybe 80 % correct
  Spelling errors etc.

Challenges:
Common knowledge is important, but difficult to model.
  Semantic networks
  Ontologies: what exists in the world
  Dynamic models: change over time
  Situational semantics
  Fuzzy logic
  Explanation-driven processing

Challenges:
Highly ambiguous
  It is often not clear to humans exactly what is referred to.
  We have place-holding pronouns: It rains.
  We have general reference, where possibly more than one thing is referred to, to some degree.

Challenges: Fuzziness:
[Figure: C A R]
In fuzzy logic we can say that the car is 0.70 in parking pocket A and 0.30 in parking pocket B.
This is not the same as saying it is in A with probability 0.70 (it would then be either in A or somewhere else).
Similarly, reference might be to more than one thing to some degree, simultaneously.

Developing a program:

A Classification task:
Machine Learning
  Decide yes/no for coreference.
  Soon, Ng, & Lim, 2001. A Machine Learning Approach to Coreference Resolution of Noun Phrases. Computational Linguistics, Vol. 27(4).

Preprocessing:
Tilburg Memory Based Learner: http://pi0657.uvt.nl/
Input: Now is a tough time to be a computer maker.
1) tagging, 2) chunking, 3) functional role detection:
[NP1Subject Now/RB ] [VP1 is/VBZ ] [NP1NP-PRD a/DT tough/JJ time/NN ] [VP2 to/TO be/VB ] [NP2NP-PRD a/DT computer/NN maker/NN ]

An example of realistic input:
[NP1Subject Sun/NNP Microsystems/NNPS ] ,/, [P along/IN ] {PNP [P with/IN ] [NP its/PRP$ rivals/NNS] } ,/, [VP1 has/VBZ had/VBD to/TO go/VB to/TO ] ``/`` [NP1Object warp/NN speed/NN ] and/CC [VP2 then/RB back/VB ] '/UNKNOWN ,/, [NP3Subject Scott/NNP McNealy/NNP ] ,/, [NP4Subject its/PRP$ chief/JJ executive/NN ] ,/, [VP3 said/VBD ] [NP3NP-TMP last/JJ week/NN ] ,/, [C as/IN ] [NP4Subject Sun/NNP ] [VP4 announced/VBD ] [C that/IN ] [NP5Subject it/PRP ] [VP5 would/MD make/VB ] [NP5Object a/DT larger-than-expected/JJ loss/NN ] {PNP [P in/IN ] [NP the/DT current/JJ quarter/NN ] } and/CC [VP6 would/MD lay/VB ] [PRT off/RP ] [NP6Object 3,900/CD workers/NNS ] ./.

Machine Learning:
Train a match function for deciding the anaphor-antecedent relation.
TiMBL
Easy to expand the model when more data is available.

Machine Learning: training:
We start from a large collection of examples.
For each anaphor:
  construct a match vector for each candidate
  mark the vector for antecedent (yes/no)
The match is calculated for 9 features:
  string, lemma, suffix (form)
  subject, object, complement (of the same verb)
  same functional role
  grammatical gender
  number

Machine Learning: testing:
Construct match vectors for the nearest 40 candidates.
Check the outcome against the large database.
For example, for the first candidate the nearest neighbor has 4 matching features. Collect from the database all exemplars with 3 or 4 matching features.
  Outcome: 90 no, 10 yes

Machine Learning: testing:
Repeat for the 40 candidates.
  Outcome 1: 90 no, 10 yes
  Outcome 2: 43 no, 5 yes
  ...
  Outcome 40: 120 no, 10 yes
How to decide for yes or no?
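
As a rough illustration of the training and testing slides, the sketch below builds match vectors and counts yes/no outcomes among the nearest stored exemplars. It is plain Python rather than TiMBL, it uses six simplified features instead of the nine listed on the slide, and the mention dictionaries, the helpers `match_vector`, `overlap` and `outcome`, and the toy `memory` are all hypothetical.

```python
from collections import Counter

# Simplified feature set (an assumption; the slides list nine features).
FEATURES = ("string", "lemma", "suffix", "role", "gender", "number")


def match_vector(anaphor, candidate):
    """Return 1 where the two mentions agree on a feature, else 0."""
    return tuple(int(anaphor[f] == candidate[f]) for f in FEATURES)


def overlap(a, b):
    """Count positions where two match vectors have the same value."""
    return sum(x == y for x, y in zip(a, b))


def outcome(test_vec, memory, bands=2):
    """Count (yes, no) labels among exemplars in the top `bands` overlap levels.

    With bands=2 and a best overlap of 4, exemplars with 3 or 4 matching
    features are collected, as in the example on the testing slide.
    """
    best = max(overlap(test_vec, vec) for vec, _ in memory)
    counts = Counter(
        label for vec, label in memory if overlap(test_vec, vec) > best - bands
    )
    return counts["yes"], counts["no"]


# Toy mentions, loosely based on the Sun Microsystems sentence (hypothetical values).
it = {"string": "it", "lemma": "it", "suffix": "", "role": "subject",
      "gender": "neuter", "number": "sg"}
sun = {"string": "Sun", "lemma": "sun", "suffix": "", "role": "subject",
       "gender": "neuter", "number": "sg"}
workers = {"string": "workers", "lemma": "worker", "suffix": "s", "role": "object",
           "gender": "neuter", "number": "pl"}

# Toy training memory: (match vector, was the candidate the antecedent?).
memory = [
    ((0, 0, 1, 1, 1, 1), "yes"),
    ((0, 0, 1, 1, 1, 0), "no"),
    ((0, 0, 0, 0, 1, 0), "no"),
    ((1, 1, 1, 0, 1, 1), "yes"),
]

for candidate in (sun, workers):
    vec = match_vector(it, candidate)
    print(candidate["string"], vec, outcome(vec, memory))
```

Each candidate ends up with a pair of yes/no counts, which is the kind of outcome the testing slides list for the 40 nearest candidates.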
Machine Learning: testing:
How to decide for yes or no?
We have decided that the most extreme outcome is probably the best.
We have to calculate the expected values for yes / no from the training set.
Score = (Observed_yes - Expected_yes) / std.dev_yes - (Observed_no - Expected_no) / std.dev_no

Conclusion 1(3):
Coreference is a very general problem in natural language processing.
It also extends into related domains, such as coreference of images.
Establishing coreference has many applications: MT, IR, T2S, etc.
Coreference is also a phenomenon with inherent difficulties.
  Coreference might be vague and/or ambiguous.
  Coreference often depends heavily on background knowledge, which can be difficult to capture in a formal model.

Conclusion 2(3):
Using Machine Learning to adapt coreference resolution to a textual domain gives us a general method to handle the problem.
One problem of Machine Learning is to find the relevant features.
Another problem is that the features we use often interact with each other.
Vagueness and ambiguity often make it impossible to select only one candidate for coreference.

Conclusion 3(3):
There is certainly a problem with evaluation, as some mistakes are more serious than others.
Machine Learning of coreference is still a young research field.
Much work is needed, and many good ideas are certain to emerge.

Thank you for listening:
http://ling.uib.no/BREDT/
Christer.Johansson@lili.uib.no
Lars.Johnsen@lili.uib.no
Kaja.Borthen@hf.ntnu.no
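
The score from the last "Machine Learning: testing" slide can be sketched in a few lines of Python. The yes/no counts are the three outcomes shown on the slides; the expected values and standard deviations are made-up placeholders (the slides say they are calculated from the training set), and reading "most extreme" as "highest score" is an assumption. The names `score` and `best_candidate` are hypothetical.

```python
def score(obs_yes, obs_no, exp_yes, exp_no, sd_yes, sd_no):
    """Standardised surplus of yes votes minus standardised surplus of no votes."""
    return (obs_yes - exp_yes) / sd_yes - (obs_no - exp_no) / sd_no


def best_candidate(outcomes, exp_yes, exp_no, sd_yes, sd_no):
    """Return (index of the highest-scoring candidate, list of all scores)."""
    scores = [score(y, n, exp_yes, exp_no, sd_yes, sd_no) for y, n in outcomes]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# (yes, no) counts for outcomes 1, 2 and 40 as listed on the slides.
outcomes = [(10, 90), (5, 43), (10, 120)]

# Placeholder statistics, NOT from the source; in practice they would be
# estimated from the training set.
best, scores = best_candidate(outcomes, exp_yes=8, exp_no=80, sd_yes=2, sd_no=20)
print(best, scores)
```

A candidate scores high when it collects more yes votes and fewer no votes than expected, so a large neighbourhood dominated by no votes is penalised even if its raw yes count is the same.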
