Lecture 3: NLP Grammars and Parsing


Published on October 21, 2007

Author: FunnyGuy

Source: authorstream.com

Parsing Natural Languages with Context-free Grammars
Martin Volk
Computational Linguistics, Stockholm University
volk@ling.su.se

The Chomsky Hierarchy
The Chomsky Hierarchy states restrictions on rules. Given that A, B are non-terminals, x is a string of terminals, and α, β, γ are arbitrary strings (of terminals and non-terminals), each rule is of the form:
Type 3: A → xB or A → x
Type 2: A → α
Type 1: βAγ → βαγ, where α is not empty
Type 0: the left side of the rule is not empty

Context-free grammars
(may) have rules like
NP → Det N
PP → Prep NP
cannot have rules like
NP PP → PP NP
ADV anfangen → fangen ADV an
This restriction has implications for processing resources and speed.

Issues
Why do computational linguists use formal grammars for describing natural languages?
Are natural languages context-free languages?
Are there grammar formalisms that linguists prefer? → ID/LP-grammars

The goal of Natural Language Processing (NLP)
Given a natural language utterance (written or spoken), determine: who did what to whom, when, where, how, and why (for what reasons, for what purpose)?
Towards this goal: determine the syntactic structure of the utterance.

Steps to syntax analysis
1. For every word in the input string, determine its word class.
2. Group all words into constituents.
3. Determine the linguistic functions (subject, object, etc.) of the constituents.
4. Determine the logical functions (agent, recipient, transferred object, place, time, …).

An example
A book was given to Mary by Peter.
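Step 1 (word-class assignment) can be sketched as a plain lexicon lookup. The tiny lexicon below is an assumption covering only this example sentence; real systems use large lexicons plus a tagger to resolve ambiguous words:

```python
# Sketch of step 1: assign a word class to every word via lexicon lookup.
# The lexicon entries here are illustrative, not from the slides.
LEXICON = {
    "a": "det", "book": "noun", "was": "aux", "given": "verb",
    "to": "prep", "mary": "name", "by": "prep", "peter": "name",
}

def word_classes(sentence):
    """Look up the word class of every token in the sentence."""
    return [LEXICON[w.strip(".,").lower()] for w in sentence.split()]

print(word_classes("A book was given to Mary by Peter."))
# ['det', 'noun', 'aux', 'verb', 'prep', 'name', 'prep', 'name']
```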
A book      was given     to Mary      by Peter
det noun    aux verb      prep name    prep name
[NP]        [verb group]  [PP]         [PP]

The verb group and the two prepositional phrases form the verb phrase; together with the noun phrase they form the sentence:
[S [NP A book] [VP was given to Mary by Peter]]
This is a passive sentence: "by Peter" is the logical subject, "a book" the logical object.

Result
Agent (the giver): Peter
The object: a book
Recipient: Mary
Action: giving
When: in the past
Via inference: Who has a book now? Mary.

The context-free rules of a natural language grammar
Noun_Phrase → Determiner Noun
# a book, the house, some houses, 50 books, Peter's house
Adjective_Phrase → Adjective
Adjective_Phrase → Adverb Adjective
# nice, nicest, very nice, hardly finished
Noun_Phrase → Det Adjective_Phrase Noun
# a nice book, the old house, some very old houses, 50 green books
Prep_Phrase → Preposition Noun_Phrase
# with a nice book, through the old house, in some very old houses, for 50 green books

The rules (may) include recursion, direct and indirect:
NP → NP PP           # the bridge over the Nile
NP → NP Srelative    # a student who likes this course
Srelative → NP VP    # who likes this course

a student who likes this course
det noun rel-pron verb det noun
[NP [NP a student] [Srel [NP who] [VP likes [NP this course]]]]

Formal Definition of a Context-free Grammar
A context-free grammar consists of:
a set of non-terminal symbols N
a set of terminal symbols Σ
a set of productions A → α, with A ∈ N and α a string over (N ∪ Σ)*
a designated start symbol (from N)

Context-free grammars for natural language
A set of non-terminal symbols N: word class symbols (N, V, Adj, Adv, P) and linguistic constituent symbols (NP, VP, AdjP, AdvP, PP)
A set of terminals Σ: all words of the English language
A set of productions A → α: the grammar rules (e.g. NP → Det AdjP N)
A designated start symbol: a symbol for the complete sentence

How many …?
How many non-terminals do we need?
word class symbols (N, V, Adj, Adv, P): usually between 20 and 50
linguistic constituent symbols (NP, VP, …): usually between 10 and 20
How many terminals do we need? The words of the English language:
different word stems (see, walk, give): more than 50,000
different word forms (see, sees, saw, seen): more than 100,000

How many grammar rules do we need?
NP → Name        # Mary, Peter
NP → Det Noun    # a book
PP → Prep NP     # to Mary
VP → V NP PP     # gave a book to Mary
VP → V NP NP     # gave Mary a book
Problem: this grammar will also accept:
*Peter give Mary a books.    # agreement problem
*Peter sees Mary a book.     # complement problem

Agreement: Why bother?
*Peter give Mary a books.
Consider:
Peter threw the books into the garbage can that are old and grey.
Peter threw the books into the garbage can that is old and grey.
Agreement can help us determine the intended meaning!

Agreement: First approach
NPsg → Namesg        # Mary, Peter
NPsg → Detsg Nounsg  # a book
NPpl → Detpl Nounpl  # the books
PP → Prep NPsg       # to Mary
PP → Prep NPpl       # for the books
VP → V NPsg NPsg     # gave Mary a book
VP → V NPsg NPpl     # gave Mary the books
VP → V NPpl NPsg     # gave the kids a book
VP → V NPpl NPpl     # gave the kids the books
Combinatorial explosion … too many rules.

Agreement: Better approach
Variables ensure agreement via feature unification.
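A minimal sketch of what such feature unification does; the toy lexicon and the two-feature representation are my own illustration, not the formalism from the slides:

```python
# Toy feature unification for number agreement in "Det Noun" phrases.
# Each lexicon entry carries a category and a Num value; None means
# the word is unspecified for number (it unifies with anything).
LEXICON = {
    "a": ("Det", "sg"), "the": ("Det", None),
    "book": ("Noun", "sg"), "books": ("Noun", "pl"),
}

def unify_np(det, noun):
    """Return the NP's Num value if Det and Noun agree, else None."""
    dcat, dnum = LEXICON[det]
    ncat, nnum = LEXICON[noun]
    if dcat != "Det" or ncat != "Noun":
        return None
    if dnum is None or nnum is None or dnum == nnum:
        return nnum or dnum          # the unified Num value
    return None                      # unification fails: *"a books"

print(unify_np("a", "book"))   # sg
print(unify_np("a", "books"))  # None (agreement violation)
```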
NP[Num] → Name[Num]           # Mary, Peter
NP[Num] → Det[Num] Noun[Num]  # a book, the books
PP → Prep NP[X]               # to Mary, for the books
VP[Num] → V[Num] NP[X] NP[Y]  # give Mary a book; gives Mary the books

Subcategorization
Verbs have preferences for the kinds of constituents they co-occur with. For example:
VP → Verb          # (disappear)
VP → Verb NP       # (prefer a morning flight)
VP → Verb NP PP    # (leave Boston in the morning)
VP → Verb PP       # (leaving on Thursday)
But not: *I disappeared the cat.

Parsing as Search
Top-down parsing
Bottom-up parsing
→ see the Jurafsky slides

That sounds nice … where is the problem?

From the Financial Times of Nov. 23, 2004, at http://news.ft.com/home/europe:
McDonald's CEO steps down to battle cancer
By Neil Buckley in New York
Published: November 23 2004 00:51
McDonald's said on Monday night Charlie Bell would step down as chief executive to devote his time to battling colorectal cancer, dealing another blow to the world's largest fast food company. Mr Bell's resignation comes just seven months after James Cantalupo, its former chairman and chief executive, died from a heart attack. McDonald's moved quickly to close the gap, appointing Jim Skinner, currently vice-chairman, to the chief executive's role.

Problems when parsing natural language sentences
Words that are (perhaps) not in the lexicon:
Proper names: James Cantalupo, McDonald's, InterContinental, GE
Compound words, which need to be segmented: kurskamrater, kurslitteratur, kursavsnitt, kursplaneundersökningarna, kursförluster, valutakurs, snabbkurs, säljkurser, aktiekurser, valutakursindex
Foreign language expressions: Don Kerr är Mellanösternspecialist på The International Institute for Strategic Studies i London, högt ansedd, oberoende thinktank. ("Don Kerr is a Middle East specialist at The International Institute for Strategic Studies in London, a highly regarded, independent think tank.")
Multiword expressions:
Idioms: to deal another blow
Metaphors: to battle cancer

Problems when parsing natural language sentences (continued)
Ambiguities:
Word level (kurs as in valutakurs or kurskamrat)
Sentence level:
He sees the man with the telescope.
Old men and women left the occupied city.
Additional knowledge sources are needed to resolve ambiguities: more world knowledge, and statistical knowledge (parsing preferences).

How can we obtain statistical preferences?
From a parsed and manually checked corpus (= a collection of sentences). Such a corpus is usually a database that stores the correct syntax tree with each sentence (and is therefore called a treebank). Building a treebank is very time-consuming.

Can all the syntax of natural language be described with context-free rules? Are there phenomena in natural language that require context-sensitive rules?

Limits of Context-free Grammars
It is not possible to write a context-free grammar (or to design a Push-Down Automaton (PDA)) for the language L = {a^n b^n a^n | n > 0}. Why? Intuitively: the memory component of a PDA works like a stack (one stack!), so it can only be used to count once.

Are natural languages context-free?
Yes! But … there is a famous paper about some constructions in Swiss German of the form w a^n b^m x c^n d^m y:
Jan säit, das mer (em Hans) (es huus) (hälfed) (aastriiche).
Jan säit, das mer (d'chind)^n (em Hans)^m (es huus) (haend wele laa)^n (hälfe)^m (aastriiche).
But such constructions are rather strange and rare, and the claim that they are not context-free relies on the assumption that n and m are unbounded.

The notion of "context"
We need "context" to understand a natural language utterance! But this notion of "context" is different from the notion of "context" in the name "context-free languages".

Do linguists like context-free grammars?
Not really …

Linguists want …
… to express grammar rules on different levels of abstraction. For example, instead of saying:
NP → NP Conj NP        # the boy and the girl
VP → VP Conj VP        # sang and danced
AdjP → AdjP Conj AdjP  # wise and very famous
they would like to say:
XP → XP Conj XP

Linguists want …
… (to be able) to state dominance and precedence separately.
Peter dropped the course happily.
Happily Peter dropped the course.
S → Adv S'
S → S' Adv

Context-free Grammars
Context-free grammar rules encode both dominance and precedence information.
Example: A → B C D
A dominates B and C and D, and B precedes C, which in turn precedes D.

ID/LP-Grammars
ID/LP-grammars have separate rules: ID (immediate dominance) rules and LP (linear precedence) rules.
Example:
ID-rule: A → {B, C, D}   # A dominates B and C and D
LP-rule: B < C           # B precedes C
ID/LP grammars have been proposed in linguistics, e.g. in Generalized Phrase Structure Grammar (GPSG; Gazdar, Klein, Pullum, Sag, 1985).

ID/LP-Grammars: an example from German
Gestern hat [VP der Professor der Sekretärin diese Blumen geschenkt].
Gestern hat [VP der Professor diese Blumen der Sekretärin geschenkt].
Gestern hat [VP diese Blumen der Professor der Sekretärin geschenkt].
Gestern hat [VP diese Blumen der Sekretärin der Professor geschenkt].
Gestern hat [VP der Sekretärin der Professor diese Blumen geschenkt].
Gestern hat [VP der Sekretärin diese Blumen der Professor geschenkt].

The German verb phrase (or Mittelfeld) consists of an NP_nominative, an NP_dative, an NP_accusative, and a verb. Accounting for all order variations requires 6 context-free grammar rules, but only one ID-rule plus one LP-rule:
VP → {NP_accusative, NP_dative, NP_nominative, V}
NP < V

ID/LP-Grammars vs. Context-free Grammars
All ID/LP-grammars can be transformed into strongly equivalent context-free grammars. Some context-free grammars cannot be transformed into strongly equivalent ID/LP grammars.
Example: The context-free grammar consisting of the rule A → a c a cannot be transformed into a strongly equivalent ID/LP grammar, because of contradictory ordering constraints: a before c AND c before a. An additional non-terminal is required:
ID-rules: A → {Z, a}   Z → {a, c}
LP-rules: Z < a   a < c

Summary
Why do computational linguists use formal grammars for describing natural languages? As an intermediate step towards capturing the meaning of natural language utterances.
Are natural languages context-free languages? The syntax of natural languages can, in general, be described with context-free grammars.
What grammar formalisms do linguists prefer? Linguists want to describe natural language as precisely and as conveniently as possible. They prefer grammar formalisms with feature variables, metarules, ID/LP separation, schemata, abstract rules, …

Any Questions?
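The Mittelfeld example can be checked with a short sketch that expands one ID rule plus LP constraints into the equivalent set of context-free right-hand sides. The category names follow the slides; the representation (strings, prefix-based categories) is my own simplification:

```python
from itertools import permutations

# Expand an ID rule (an unordered set of daughters) plus LP constraints
# into all context-free right-hand sides consistent with the constraints.
def expand_idlp(daughters, lp_rules):
    """All orderings of `daughters` that respect the LP rules.

    Each LP rule (a, b) means: every daughter of category a must precede
    every daughter of category b. A daughter's category is the part of
    its name before "_".
    """
    cat = lambda d: d.split("_")[0]
    def respects(order):
        # No later daughter of category a may follow an earlier one of b.
        return all(not (cat(x) == b and cat(y) == a)
                   for a, b in lp_rules
                   for i, x in enumerate(order)
                   for y in order[i + 1:])
    return [order for order in permutations(daughters) if respects(order)]

# VP -> {NP_accusative, NP_dative, NP_nominative, V} with LP rule NP < V:
rules = expand_idlp(["NP_accusative", "NP_dative", "NP_nominative", "V"],
                    [("NP", "V")])
print(len(rules))  # 6 orderings, all with V in final position
```

This reproduces the count from the slide: the single ID/LP rule pair stands in for 6 context-free rules.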
