Published on February 20, 2014
Lexical Simplification: Improving Understandability, Reducing Errors. Matt Shardlow, University of Manchester, lexicalsimplification.blogspot.co.uk
The Problem ● Text is everywhere ● But is it understandable? ● Simplification as assistive technology. ● Concentrating on vocabulary 2
Examples ● Technical Medical Language – Hypertension risk factors include obesity,... – High blood pressure risk factors include excessive weight,... ● Legal Language – The Products transacted through the Service are... – The Products managed through the Service are... ● Low Literacy Readers – The plan resolved to increase the frequency of flooding... – The plan determined to raise the amount of flooding... 3
How? ● Lexical simplification – Practical Simplification of English Text (PSET) (Carroll et al. '98) – Replaces low frequency words with high frequency ● Paraphrasing – Medical language (Deléger and Zweigenbaum '09) – Wikipedia (Yatskar et al. '10) 4
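The frequency-based replacement idea on this slide can be sketched as follows. This is a minimal illustration, not the PSET implementation: the frequency counts, the synonym table, and the function name `simplify_by_frequency` are all hypothetical and chosen only so that the low-literacy example from the Examples slide comes out as shown there.

```python
from typing import Dict, List

# Hypothetical toy data: counts and synonyms are invented for illustration only.
TOY_FREQUENCY: Dict[str, int] = {
    "increase": 900, "raise": 1200,
    "frequency": 300, "amount": 2500,
    "resolved": 150, "determined": 400,
}
TOY_SYNONYMS: Dict[str, List[str]] = {
    "increase": ["raise"],
    "frequency": ["amount"],
    "resolved": ["determined"],
}


def simplify_by_frequency(tokens: List[str],
                          frequency: Dict[str, int],
                          synonyms: Dict[str, List[str]]) -> List[str]:
    """Replace each token with its most frequent synonym, if one is more frequent."""
    simplified = []
    for token in tokens:
        candidates = synonyms.get(token, [])
        # The original token competes with its synonyms; unknown words count as 0.
        best = max(candidates + [token], key=lambda w: frequency.get(w, 0))
        simplified.append(best)
    return simplified


sentence = "the plan resolved to increase the frequency of flooding".split()
print(" ".join(simplify_by_frequency(sentence, TOY_FREQUENCY, TOY_SYNONYMS)))
# prints: the plan determined to raise the amount of flooding
```

Note that this sketch ignores context entirely, which is exactly where meaning-changing substitutions can creep in.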
How? ● Word Sense Disambiguation – Language modelling (De Belder and Moens '10) – WordNet (Thomas and Anderson '12) ● Ranking Task – SemEval-2012: Task 1 (Specia et al. '12) – Shared data and evaluation measure. 5
The Pipeline ● Identification of Complex Words – … the frequency of floods ● Generation of Substitutions – Frequency: incidence, amount, recurrence, repetition, prevalence ● Word Sense Disambiguation – Frequency: incidence, amount, recurrence, repetition, prevalence ● Ranking Synonyms by Simplicity – 1) amount 2) repetition 3) recurrence 6
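The four pipeline stages can be sketched as four small functions chained together. Everything here is an assumption made for illustration: the frequency counts, the threshold `COMPLEX_BELOW`, the thesaurus, and in particular the `FITS_CONTEXT` table, which stands in for a real word sense disambiguation step by simply listing the candidates that survive to the ranking on the slide.

```python
from typing import Dict, List, Set

# Hypothetical toy resources; every value below is invented for illustration.
FREQ: Dict[str, int] = {
    "the": 5000, "of": 4800, "floods": 120, "frequency": 60,
    "amount": 900, "repetition": 80, "recurrence": 20,
    "incidence": 15, "prevalence": 10,
}
THESAURUS: Dict[str, List[str]] = {
    "frequency": ["incidence", "amount", "recurrence", "repetition", "prevalence"],
}
# Assumed stand-in for word sense disambiguation: candidates that fit this context.
FITS_CONTEXT: Dict[str, Set[str]] = {
    "frequency": {"amount", "recurrence", "repetition"},
}
COMPLEX_BELOW = 100  # assumed threshold: rarer words are treated as complex


def identify_complex_words(tokens: List[str]) -> List[str]:
    """Stage 1: flag words whose frequency falls below the threshold."""
    return [t for t in tokens if FREQ.get(t, 0) < COMPLEX_BELOW]


def generate_substitutions(word: str) -> List[str]:
    """Stage 2: look up candidate replacements."""
    return list(THESAURUS.get(word, []))


def disambiguate(word: str, candidates: List[str]) -> List[str]:
    """Stage 3: keep only candidates that fit the context (toy lookup)."""
    allowed = FITS_CONTEXT.get(word, set())
    return [c for c in candidates if c in allowed]


def rank_by_simplicity(candidates: List[str]) -> List[str]:
    """Stage 4: order candidates from most to least frequent."""
    return sorted(candidates, key=lambda c: FREQ.get(c, 0), reverse=True)


def simplify(tokens: List[str]) -> Dict[str, List[str]]:
    """Run all four stages and return ranked candidates per complex word."""
    ranked = {}
    for word in identify_complex_words(tokens):
        candidates = disambiguate(word, generate_substitutions(word))
        ranked[word] = rank_by_simplicity(candidates)
    return ranked


print(simplify("the frequency of floods".split()))
# prints: {'frequency': ['amount', 'repetition', 'recurrence']}
```

Keeping each stage as a separate function makes it possible to inspect the intermediate output of every step, which is what an error analysis per stage requires.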
Errors in the pipeline ● The event's legacy “hangs in the balance”. ● The event's legacy “falls in the balance”. ● … a vote paving the way for military action. ● … a vote pavement the way for military action. ● … the understanding of neurological disorders. ● … the understanding of neurological disorders. 7
An Experiment ● Baseline simplification system ● Corpus of news texts ● Error categories 8
Error Categories ● Type 1: No error ● Type 2A: An undetected complex word ● Type 2B: A simple word identified as complex ● Type 3A: No substitutions available ● Type 3B: No simpler substitutions available ● Type 4: A change in meaning ● Type 5: The resulting text is more difficult 9
The Pipeline (2) ● Identification of Complex Words – Type 2A / 2B ● Generation of Substitutions – Type 3A / 3B ● Word Sense Disambiguation – Type 4 ● Ranking Synonyms by Simplicity – Type 5 10
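The mapping from error types to pipeline stages can be written down as a small lookup, which then lets annotated errors be tallied per stage. The names `ErrorType`, `STAGE_OF_ERROR`, and `errors_per_stage` are hypothetical, and the annotation list at the end is made-up input, not data from the experiment.

```python
from collections import Counter
from enum import Enum
from typing import Iterable


class ErrorType(Enum):
    NO_ERROR = "1"
    UNDETECTED_COMPLEX_WORD = "2A"
    SIMPLE_WORD_FLAGGED = "2B"
    NO_SUBSTITUTIONS = "3A"
    NO_SIMPLER_SUBSTITUTIONS = "3B"
    MEANING_CHANGED = "4"
    MORE_DIFFICULT = "5"


# Stage responsible for each error type, following the slide above.
STAGE_OF_ERROR = {
    ErrorType.UNDETECTED_COMPLEX_WORD: "Identification of Complex Words",
    ErrorType.SIMPLE_WORD_FLAGGED: "Identification of Complex Words",
    ErrorType.NO_SUBSTITUTIONS: "Generation of Substitutions",
    ErrorType.NO_SIMPLER_SUBSTITUTIONS: "Generation of Substitutions",
    ErrorType.MEANING_CHANGED: "Word Sense Disambiguation",
    ErrorType.MORE_DIFFICULT: "Ranking Synonyms by Simplicity",
}


def errors_per_stage(annotations: Iterable[ErrorType]) -> Counter:
    """Count annotated errors by pipeline stage; Type 1 (no error) is skipped."""
    counts: Counter = Counter()
    for error in annotations:
        stage = STAGE_OF_ERROR.get(error)
        if stage is not None:
            counts[stage] += 1
    return counts


# Made-up annotations, for illustration only.
toy_annotations = [
    ErrorType.NO_ERROR,
    ErrorType.UNDETECTED_COMPLEX_WORD,
    ErrorType.SIMPLE_WORD_FLAGGED,
    ErrorType.MEANING_CHANGED,
    ErrorType.UNDETECTED_COMPLEX_WORD,
]
print(errors_per_stage(toy_annotations))
```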
Annotation ● Verbose system output ● Single annotator ● One sitting ● Annotations – Recorded – Cross validated 11
Error Distribution - Raw 12
Error Distribution – Independence 13
Where to? ● Working to mitigate the errors ● Evaluation for each pipeline step ● Domain adaptability ● Personal Integration 14
Any Questions? References ● J. Carroll, G. Minnen, Y. Canning, S. Devlin, and J. Tait. Practical simplification of English newspaper text to assist aphasic readers. AAAI, 1998. ● L. Deléger and P. Zweigenbaum. Extracting lay paraphrases of specialized expressions from monolingual comparable medical corpora. BUCC, 2009. ● J. De Belder and M. Moens. Text simplification for children. In Proceedings of the SIGIR workshop on accessible search systems, 2010. ● S. R. Thomas and S. Anderson. WordNet-based lexical simplification of a document. KONVENS, 2012. ● L. Specia, S. K. Jauhar, and R. Mihalcea. SemEval-2012 Task 1: English lexical simplification. SemEval, 2012. 15
IAA ● Post-hoc inter-annotator agreement study ● 3 Annotators ● 20 sentences ● Same format of annotation as first experiment ● Fleiss' kappa calculated ● Moderate kappa agreement of 0.3556 16
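For reference, Fleiss' kappa can be computed from a table of per-item category counts as sketched below. This is the standard formula, not the script used in the study; the function name `fleiss_kappa` and the toy table are assumptions. Interpretation bands for kappa vary by convention, so the sketch reports only the number.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.
    Every item must be rated by the same number of annotators (at least 2)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Observed agreement: mean over items of the pairwise agreement P_i.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    n_categories = len(counts[0])
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy table: 4 items, 3 annotators, 2 categories (made-up ratings).
toy_counts = [[2, 1], [1, 2], [3, 0], [0, 3]]
print(round(fleiss_kappa(toy_counts), 4))
# prints: 0.3333
```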