emnlp03


Published on January 11, 2008

Author: Davidino

Source: authorstream.com

Learning Extraction Patterns for Subjective Expressions:
Ellen Riloff, University of Utah
Janyce Wiebe, University of Pittsburgh

Subjectivity:
Subjective language includes opinions, speculations, emotions
Distinguishing subjective and objective information could benefit many applications:
  Information extraction (discard subjective information or label it as uncertain)
  Question answering (find answers reflecting different opinions)
  Summarization (summarize various views on topic)

Goals:
Sentence-level subjectivity classification
  Wiebe et al. 2001 found that 44% of sentences in news articles are subjective
Learning subjectivity clues from unannotated text corpora
Learning linguistically rich patterns (represented as IE extraction patterns)

Previous Work in NLP Subjectivity Analysis in Text:
Document-level subjectivity classification (e.g., Turney 2002; Pang et al. 2002; Spertus 1997) and above (Tong 2001)
Genre classification (e.g., Karlgren and Cutting 1994; Kessler et al. 1997; Wiebe et al. 2001)
Supervised sentence-level classification (Wiebe et al. 1999)
Learning adjectives, adjectival phrases, verbs, nouns, and N-grams (e.g., Turney 2002; Hatzivassiloglou & McKeown 1997; Wiebe et al. 2001; Riloff et al. 2003)

Recent Related Work:
Yu and Hatzivassiloglou (EMNLP03): unsupervised sentence-level classification. Complementary approach and features.
Dave et al. (WWW03): reviews classified as positive or negative.
Agrawal et al. (WWW03): newsgroup authors partitioned into camps based on quotation links
Gordon et al. (ACL03): manually developed grammars for some types of subjective language

Extraction Patterns:
Extraction patterns are lexico-syntactic patterns to identify relevant information
Typically they represent role relationships surrounding noun and verb phrases
  hijacking of <x>: hijacked vehicle
  <x> was hijacked: hijacked vehicle
  <x> hijacked: hijacker

Our Method:
Subjective expressions represented as extraction patterns
  get to know <dobj>
  <subj> appear to be
  <subj> was satisfied
  <subj> complained
Subtle variations can be significant: “The comedian bombed last night.”
Often higher precision than sub-expressions
More general than fixed n-grams
Supervised extraction pattern learning
Training data generated automatically
Entire process bootstrapped

Slide15:
(Diagram of the bootstrapping process, repeated throughout the talk with different parts highlighted.)
Components: Unannotated Text Collection, Known subjective vocabulary, Subjective Classifier, Objective Classifier, Extraction Pattern Learner, Pattern-Based Subjective Classifier
Arrow labels: unlabeled sentences, subjective sentences, objective sentences, subjective patterns
Results for 1 cycle

Test Data:
Manual annotation for multiple perspective QA (ARDA AQUAINT NRRC) (working on copyright issues to release corpus this summer)
Good agreement on sentence classes used here
  0.77 ave pair-wise kappa
  0.89 ave pair-wise kappa with borderline sentences removed (11% of the corpus)
Wilson & Wiebe SIGdial 2003 describes the annotation scheme and agreement study

Slide23:
Unannotated Text Collection
English language versions of FBIS news articles from a variety of countries.
Size: 302,160 sentences

Slide27:
Known subjective vocabulary
From previous work
  Manually identified (e.g., entries from Levin 1993)
  Automatically identified (e.g., nouns from Riloff et al. 2003, CoNLL03)
Strongly subjective: most instances subjective
Weakly subjective: objective instances also common
Any data used is separate from data in this paper

Slide29:
Subjective Classifier: >1 strongly subjective clue
91.3% Precision, 31.9% Recall
Test set: 2197 sentences, 59% subjective

Slide30:
Subjective Classifier: 2+ strongly subjective clues
Objective Classifier: previous, current, next sentence: 0 strongly subjective clue & 0 or 1 weakly subjective clue
82.6% Precision, 16.4% Recall

Slide32:
Extraction Pattern Learner: AutoSlog-TS (Riloff 1996)
Input from Subjective Classifier: 17,000 subjective sentences (“relevant texts”)
Input from Objective Classifier: 17,000 objective sentences (“irrelevant texts”)
Output: subjective patterns

Step 1: Apply Syntactic Templates:
Syntactic template          Example pattern
<subj> active-verb dobj     <subj> dealt blow
<subj> verb infinitive      <subj> appear to be
<subj> aux noun             <subj> has position
active-verb <dobj>          endorsed <dobj>
verb infinitive <dobj>      get to know <dobj>
noun prep <np>              opinion on <np>
infinitive prep <np>        to resort to <np>

Example: <subj> dealt blow
Matches any sentence with a verb phrase with head=dealt and a direct object with head=blow.
“The experience certainly dealt a stiff blow to his pride.”

Step 2: Select Patterns:
Apply all learned patterns to training data
Calculate precision and frequency:
  precision(pattern) = # in subjective sentences / total #
Select patterns based on their frequency and precision on the training data
(No tuning on the test set)

Examples from Training Data:
(Table of example patterns with a %SUBJ column; the table contents are not preserved in this transcript.)

Evaluation of Learned Patterns:
Test data: 3947 sentences, 54% subjective
Train (selection criteria)    Test
F >= 10, P = 100%             P = 85%, Recall = 41%
F >= 2, P >= 60%              P = 71%, Recall = 92%

Slide47:
(Diagram: unlabeled sentences and Known subjective vocabulary feed the Subjective Classifier; subjective sentences feed the Extraction Pattern Learner, which produces subjective patterns.)
New subjective sentences:
  1 old clue + 1 new
  >1 new
Old + new subjective sentences go to the Extraction Pattern Learner
Patterns selected: F >= 10, P = 100% on training data

Evaluation on Test Data:
                                   Recall    Precision
Original subjective classifier     32.9%     91.3%
Augmented subjective classifier    40.1%     90.2%

Future Work:

Slide51:
(Diagram: Known subjective vocabulary, Objective Classifier, Extraction Pattern Learner, Pattern-Based Objective Classifier; arrows labeled unlabeled sentences and objective sentences.)
Improve original high-precision classifier
Identify new objective sentences during bootstrapping

Slide53:
(Diagram: Unannotated Text Collection, Known subjective vocabulary, Subjective Classifier and Objective Classifier, each marked Iteration 0 and Iteration 1+.)
Iteration 0: use corpus-independent subjectivity clues to generate initial training set
Iteration 1+: supervised learning algorithm to tune to corpus and combine old and new clues effectively

Slide54:
Known subjective vocabulary
Build up subjective lexicon as the process is applied to additional corpora
Once bootstrapping process terminates, human review of high-precision patterns
  tough act to follow: linguistic subjectivity
  Rush Limbaugh: opinionated source
  police: “lightning rod” topic

Conclusions:
High-precision subjectivity classification can be used to generate large amounts of labeled training data
Extraction pattern learning techniques can learn linguistically rich subjective patterns
Bootstrapping process results in higher recall with little loss in precision

Slide56:
Known subjective vocabulary
Build up subjective lexicon as the process is applied to new corpora.
Richer representation with deeper knowledge (theta roles, polarity, evaluative?, speculative?, tone, ambiguity, …)
Human review of high-precision patterns
  tough act to follow: linguistic subjectivity
  Rush Limbaugh: opinionated source
  police: “lightning rod” topic

Slide58:
(Bootstrapping diagram annotated with sentence counts.)
17000 subjective sentences, 17000 objective sentences
New subjective sentences come from the Pattern-Based Subjective Classifier

Slide59:
Pattern-Based Subjective Classifier: > 0 instances of patterns with F > 4, P = 100 on training data
17000 subjective sentences, 17000 objective sentences
9500 new subjective sentences

Slide60:
Sentence counts shown in the diagram: 17000, 17000, 7500, 9500 new
4248 new patterns with P > .59 on training data
308 new patterns with P = 100 on training data

Slide61:
Sentence counts shown in the diagram: 17000, 17000, 7500, 9500 new
New + old patterns on test set: recall increased more than precision decreased
+2R, -0.5P to +4R, -2P

Example:
The Foreign Ministry said Thursday that it was “surprised, to put it mildly” by the U.S. State Department’s criticism of Russia’s human rights record and objected in particular to the “odious” section on Chechnya.

Annotation Scheme:
The annotation scheme was developed as part of a U.S. government-sponsored project (ARDA AQUAINT NRRC) to investigate multiple perspective question answering.
Annotators labeled private state expressions.
Each private state can have low, medium, or high strength.
Our gold standard considers a sentence to be subjective if it contains at least one private state expression of medium or higher strength.

Two Ways of Expressing Private States:
Explicit mentions of private states and speech events
  The United States fears a spill-over from the anti-terrorist campaign
Expressive subjective elements
  The part of the US human rights report about China is full of absurdities and fabrications.

Nested Sources:
(Slide text not preserved in this transcript.)

OnlyFactive:
“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
OnlyFactive=yes
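
The pattern selection step in the talk (Step 2: compute each pattern's frequency and its precision as # in subjective sentences / total #, then keep patterns above thresholds such as F >= 10, P = 100% or F >= 2, P >= 60%) could be sketched roughly as follows. This is only an illustrative sketch, not the authors' AutoSlog-TS code: the function name `select_patterns`, the input format (each training sentence as a list of matched pattern strings plus a subjective/objective label), and the toy data are all assumptions made for the example.

```python
from collections import Counter


def select_patterns(labeled_sentences, min_freq=10, min_precision=1.0):
    """Keep patterns that are frequent and precise enough on the training data.

    labeled_sentences: iterable of (patterns, is_subjective) pairs, where
    `patterns` lists the extraction patterns that matched the sentence
    (hypothetical input format, assumed for this sketch).
    Returns {pattern: (frequency, precision)} for the selected patterns.
    """
    total = Counter()          # all instances of each pattern
    in_subjective = Counter()  # instances found in subjective sentences

    for patterns, is_subjective in labeled_sentences:
        for pattern in patterns:
            total[pattern] += 1
            if is_subjective:
                in_subjective[pattern] += 1

    selected = {}
    for pattern, freq in total.items():
        # precision(pattern) = # in subjective sentences / total #
        precision = in_subjective[pattern] / freq
        if freq >= min_freq and precision >= min_precision:
            selected[pattern] = (freq, precision)
    return selected


# Toy training data (invented for illustration only).
training = [
    (["<subj> complained", "opinion on <np>"], True),
    (["<subj> complained"], True),
    (["<subj> was hijacked"], False),
    (["opinion on <np>"], False),
    (["<subj> complained", "<subj> was hijacked"], True),
]

result = select_patterns(training, min_freq=2, min_precision=0.6)
print(result)
```

With the toy data, only `<subj> complained` passes the F >= 2, P >= 60% thresholds, since the other two patterns occur in subjective sentences only half the time. Tightening the thresholds trades recall for precision, which matches the direction of the trade-off reported in the evaluation slide.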

Learning Extraction Patterns for Subjective Expressions

In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP-03) Learning Extraction Patterns for Subjective Expressions