Lexical Simplification. Improving Understandability. Reducing Errors.

Technology

Published on February 20, 2014

Author: mattshardlow

Source: slideshare.net

Description

A talk I gave at the Language@Leeds seminar series on the 13th Feb. 2014. All about my research. What is lexical simplification? What do the errors in the lexical simplification pipeline look like? How do we quantify and mitigate these errors?

Lexical Simplification: Improving Understandability, Reducing Errors
Matt Shardlow
University of Manchester
lexicalsimplification.blogspot.co.uk

The Problem
● Text is everywhere
● But is it understandable?
● Simplification as assistive technology
● Concentrating on vocabulary

Examples
● Technical Medical Language
  – Hypertension risk factors include obesity,...
  – High blood pressure risk factors include excessive weight,...
● Legal Language
  – The Products transacted through the Service are...
  – The Products managed through the Service are...
● Low Literacy Readers
  – The plan resolved to increase the frequency of flooding...
  – The plan determined to raise the amount of flooding...

How?
● Practical Simplification of English Text (PSET)
  – Lexical simplification (Carroll et al. '98)
  – Replaces low frequency words with high frequency
● Paraphrasing
  – Medical language (Deléger and Zweigenbaum '09)
  – Wikipedia (Yatskar et al. '10)

How? (continued)
● Word Sense Disambiguation
  – Language modelling (De Belder and Moens '10)
  – WordNet (Thomas and Anderson '12)
● Ranking Task
  – SemEval-2012: Task 1 (Specia et al. '12)
  – Shared data and evaluation measure

The Pipeline
● Identification of Complex Words
  – … the frequency of floods
● Generation of Substitutions
  – Frequency: incidence, amount, recurrence, repetition, prevalence
● Word Sense Disambiguation
  – Frequency: incidence, amount, recurrence, repetition, prevalence
● Ranking Synonyms by Simplicity
  – 1) amount 2) repetition 3) recurrence
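The four pipeline stages can be sketched in a few lines of Python. This is only a toy illustration, not the system described in the talk: the frequency counts, the complexity threshold, and the `FITS_CONTEXT` sense filter are all invented for the example, and a real system would use a large corpus and a proper word sense disambiguation method instead.

```python
# Toy resources. Every value here is made up for illustration.
WORD_FREQUENCY = {
    "the": 1000, "of": 900, "amount": 120, "floods": 40, "repetition": 30,
    "recurrence": 12, "frequency": 10, "incidence": 8, "prevalence": 6,
}
SYNONYMS = {
    "frequency": ["incidence", "amount", "recurrence", "repetition", "prevalence"],
}
# Hypothetical stand-in for word sense disambiguation: the candidates
# assumed to fit the sense of the word in this particular context.
FITS_CONTEXT = {"frequency": {"amount", "repetition", "recurrence"}}

COMPLEX_THRESHOLD = 20  # assumed cut-off: rarer words count as complex


def identify_complex_words(tokens):
    """Stage 1: flag words whose frequency falls below the threshold."""
    return [t for t in tokens if WORD_FREQUENCY.get(t, 0) < COMPLEX_THRESHOLD]


def generate_substitutions(word):
    """Stage 2: look up candidate replacements."""
    return SYNONYMS.get(word, [])


def disambiguate(word, candidates):
    """Stage 3: keep only the candidates that fit the context."""
    allowed = FITS_CONTEXT.get(word, set())
    return [c for c in candidates if c in allowed]


def rank_by_simplicity(candidates):
    """Stage 4: most frequent (assumed simplest) candidate first."""
    return sorted(candidates, key=lambda c: WORD_FREQUENCY.get(c, 0), reverse=True)


def simplify(tokens):
    """Run all four stages and substitute the top-ranked candidate."""
    complex_words = set(identify_complex_words(tokens))
    output = []
    for token in tokens:
        if token in complex_words:
            ranked = rank_by_simplicity(
                disambiguate(token, generate_substitutions(token))
            )
            # Only substitute when the best candidate is actually more frequent.
            if ranked and WORD_FREQUENCY.get(ranked[0], 0) > WORD_FREQUENCY.get(token, 0):
                token = ranked[0]
        output.append(token)
    return output


print(" ".join(simplify("the frequency of floods".split())))
# prints "the amount of floods"
```

With these toy numbers the ranking stage orders the surviving candidates as amount, repetition, recurrence, matching the order shown on the slide.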

Errors in the pipeline
● The event's legacy “hangs in the balance”.
● The event's legacy “falls in the balance”.
● … a vote paving the way for military action.
● … a vote pavement the way for military action.
● … the understanding of neurological disorders.
● … the understanding of neurological disorders.

An Experiment
● Baseline simplification system
● Corpus of news texts
● Error categories

Error Categories
● Type 1: No error
● Type 2A: An undetected complex word
● Type 2B: A simple word identified as complex
● Type 3A: No substitutions available
● Type 3B: No simpler substitutions available
● Type 4: A change in meaning
● Type 5: The resulting text is more difficult

The Pipeline (2)
● Identification of Complex Words: Type 2A / 2B
● Generation of Substitutions: Type 3A / 3B
● Word Sense Disambiguation: Type 4
● Ranking Synonyms by Simplicity: Type 5
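The mapping from error types to pipeline stages can be written down as a small lookup table, which makes it easy to count annotated errors per stage. The sketch below is illustrative only: the names `Stage`, `ERROR_TYPES`, and `tally_by_stage` are hypothetical, and the sample annotations are invented rather than taken from the experiment.

```python
from collections import Counter
from enum import Enum


class Stage(Enum):
    IDENTIFICATION = "Identification of Complex Words"
    GENERATION = "Generation of Substitutions"
    DISAMBIGUATION = "Word Sense Disambiguation"
    RANKING = "Ranking Synonyms by Simplicity"


# Error type -> (description, stage responsible). Type 1 has no stage
# because it means the simplification contained no error.
ERROR_TYPES = {
    "1": ("No error", None),
    "2A": ("An undetected complex word", Stage.IDENTIFICATION),
    "2B": ("A simple word identified as complex", Stage.IDENTIFICATION),
    "3A": ("No substitutions available", Stage.GENERATION),
    "3B": ("No simpler substitutions available", Stage.GENERATION),
    "4": ("A change in meaning", Stage.DISAMBIGUATION),
    "5": ("The resulting text is more difficult", Stage.RANKING),
}


def stage_for(error_type):
    """Return the pipeline stage blamed for an error type (None for Type 1)."""
    return ERROR_TYPES[error_type][1]


def tally_by_stage(annotations):
    """Count annotated errors per pipeline stage, skipping Type 1 (no error)."""
    stages = (stage_for(a) for a in annotations)
    return Counter(s for s in stages if s is not None)


# Invented annotations, purely to show the call.
sample = ["1", "2A", "2B", "4", "1", "3B"]
for stage, count in tally_by_stage(sample).items():
    print(f"{stage.value}: {count}")
```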

Annotation
● Verbose system output
● Single annotator
● One sitting
● Annotations
  – Recorded
  – Cross validated

Error Distribution – Raw (chart not included in this transcript)

Error Distribution – Independence (chart not included in this transcript)

Where to?
● Working to mitigate the errors
● Evaluation for each pipeline step
● Domain adaptability
● Personal Integration

Any Questions?

References
● J. Carroll, G. Minnen, Y. Canning, S. Devlin, and J. Tait. Practical simplification of English newspaper text to assist aphasic readers. AAAI 1998.
● L. Deléger and P. Zweigenbaum. Extracting lay paraphrases of specialized expressions from monolingual comparable medical corpora. BUCC 2009.
● J. De Belder and M. Moens. Text simplification for children. In Proceedings of the SIGIR workshop on accessible search systems, 2010.
● S. R. Thomas and S. Anderson. WordNet-based lexical simplification of a document. KONVENS 2012.
● L. Specia, S. K. Jauhar, and R. Mihalcea. SemEval-2012 Task 1: English lexical simplification. SemEval, 2012.

IAA
● Post-hoc Inter Annotator study
● 3 Annotators
● 20 sentences
● Same format of annotation as first experiment
● Fleiss' kappa calculated
● Moderate kappa agreement of 0.3556
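For readers unfamiliar with the statistic, Fleiss' kappa compares the observed agreement between a fixed number of annotators with the agreement expected by chance. The function below is a minimal sketch of the standard formula. The input layout (one row per item, one column per category, each cell holding how many annotators chose that category) is an assumption for the example, and the numbers in the usage lines are invented, not the data from the study.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of category counts.

    ratings[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of annotators.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two annotators are required")
    if any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")

    n_categories = len(ratings[0])
    total = n_items * n_raters

    # Proportion of all assignments that went to each category.
    p_j = [sum(row[j] for row in ratings) / total for j in range(n_categories)]

    # Per-item agreement: fraction of annotator pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]

    observed = sum(p_i) / n_items        # mean observed agreement
    expected = sum(p * p for p in p_j)   # agreement expected by chance
    # Note: undefined (division by zero) if every rating falls in one category.
    return (observed - expected) / (1 - expected)


# Invented example: 3 annotators, 2 items, 2 categories.
print(fleiss_kappa([[3, 0], [0, 3]]))  # full agreement gives 1.0
print(fleiss_kappa([[2, 1], [1, 2]]))  # worse than chance gives a negative value
```

For reading the result, the widely used Landis and Koch bands label values from 0.21 to 0.40 as fair agreement and 0.41 to 0.60 as moderate agreement.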
