Published on March 1, 2014
Departamento de Ciencias de la Computación
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas
Universidad Nacional Autónoma de México
Practical Speech Recognition for Contextualized Service Robots
Ivan Meza, Caleb Rascón and Luis Pineda
http://golem.iimas.unam.mx/
GrupoGolem
Service robots
● Our future butlers
● They are task oriented
  ○ Clean up a room
  ○ Play a game
● Interaction with spoken language
● They work in noisy environments
● Microphone is not close to the speaker
● Poor speech recognition
Proposal
● Improve the system in four aspects
  ■ Contextualized recogniser
  ■ Prompting strategies
  ■ Recovery strategies
  ■ Audio calibration
I. Contextualized recognition
● Use specific language models for the given expectations
  ■ YES: yes, okay, all right
  ■ NO: no, don’t, do not
  ■ NAVIGATE: go to the kitchen, go to the living room, go to the bedroom
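A minimal sketch of the idea on this slide, assuming each expectation is backed by a small phrase list rather than a full statistical language model. The names `EXPECTATION_PHRASES`, `vocabulary_for` and `match_expectation` are hypothetical and are not taken from the robot's code; only the phrases come from the slide.

```python
# Hypothetical mapping from a dialogue expectation to the phrases its
# language model should cover (phrases copied from the slide).
EXPECTATION_PHRASES = {
    "YES": ["yes", "okay", "all right"],
    "NO": ["no", "don't", "do not"],
    "NAVIGATE": [
        "go to the kitchen",
        "go to the living room",
        "go to the bedroom",
    ],
}


def vocabulary_for(expectations):
    """Return the sorted set of words allowed under the active expectations."""
    words = set()
    for name in expectations:
        for phrase in EXPECTATION_PHRASES[name]:
            words.update(phrase.split())
    return sorted(words)


def match_expectation(hypothesis, expectations):
    """Return the first active expectation that lists the hypothesis, else None."""
    text = hypothesis.strip().lower()
    for name in expectations:
        if text in EXPECTATION_PHRASES[name]:
            return name
    return None
```

When the robot has just asked a yes/no question, only `["YES", "NO"]` would be active, so a navigation command is simply outside what the recogniser is listening for.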
II. Prompting strategies
● Let the user know when to speak
  ■ Beep sound
● Speaker volume monitor
  ■ Could you speak louder (or softer)?
III. Recovery strategy
● Let the user know when something went wrong
  ■ Could you repeat?
  ■ I can’t hear you well, could you repeat?
  ■ Sorry, I’m a little deaf
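The recovery strategy can be sketched as a retry loop. This is an illustration under stated assumptions, not the robot's dialogue manager: `recognize` and `say` are hypothetical callables supplied by the caller, `recognize` is assumed to return `None` when nothing usable was heard, and the limit of three attempts is arbitrary.

```python
import itertools

# Recovery prompts from the slide.
RECOVERY_PROMPTS = [
    "could you repeat?",
    "i can't hear you well, could you repeat",
    "sorry, i'm a little deaf",
]


def listen_with_recovery(recognize, say, max_attempts=3):
    """Call recognize() until it returns text, prompting the user after each failure.

    recognize: hypothetical callable returning recognised text or None.
    say: hypothetical callable that speaks a prompt to the user.
    Returns the recognised text, or None if every attempt failed.
    """
    prompts = itertools.cycle(RECOVERY_PROMPTS)
    for _ in range(max_attempts):
        result = recognize()
        if result is not None:
            return result
        say(next(prompts))
    return None
```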
IV. Calibration of audio settings
● Hardware
  ■ 1 directional microphone
  ■ 1 USB interface with 4 channels
  ■ 2 speakers
● Calibration of SNR in situ
  ■ Background noise at -58 dB
  ■ SNR set to 20 dB
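A rough sketch of the arithmetic behind the calibration, assuming levels are measured as RMS in dB relative to full scale on float samples in [-1.0, 1.0]. The slide does not state the reference level or the measurement procedure, so the function names and that reference are assumptions.

```python
import math


def level_db(samples):
    """RMS level of float samples in dB, assuming full scale is 1.0."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def snr_db(speech_level_db, noise_level_db):
    """Signal-to-noise ratio in dB, given both levels in dB."""
    return speech_level_db - noise_level_db


def target_speech_level(noise_level_db, desired_snr_db):
    """Speech level needed to reach the desired SNR over the given noise."""
    return noise_level_db + desired_snr_db
```

With the figures on the slide, `target_speech_level(-58, 20)` gives -38, so speech would need to arrive at roughly -38 dB to meet the 20 dB setting.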
Corpus evaluation
● Logs from the robot performing RoboCup tasks
  ■ 2 years of interactions in lab and competition
  ■ 1,439 utterances
  ■ 2,472 tokens
  ■ 120 types
  ■ 11 tasks
  ■ 9 of 11 tasks are contextualized
  ■ 14 language models
Contextualized recognition
We measure WER (the lower the better)
● With a single LM for all tasks: 53.84%
● With task-based LM: 28.28%
● With contextualized LM: 23.42%
17.2% relative error reduction (task-based to contextualized)
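For reference, a small sketch of how WER and the relative error reduction are usually computed. This is the standard word-level edit distance, not the evaluation script used for the talk; the function names are illustrative. A corpus-level WER is normally the total number of errors divided by the total number of reference words, which is not the same as averaging per-utterance values.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def relative_error_reduction(baseline, improved):
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline
```

Applied to the slide's figures, `relative_error_reduction(28.28, 23.42)` is about 0.172, which matches the 17.2% reported.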
Beep sound
● 79 utterances were recorded without the beep sound
  ■ Without beeps: 55.86%
  ■ With beeps: 39.75%
  ■ With beeps, full: 53.72%
30% to 4% relative error reduction
Usage of SoundLoc System
● We measure usage
  ■ 174 times could have been triggered
  ■ 21 soft speech
  ■ 4 louder
14.36% of the time
Recovery strategy
● We measure usage
  ■ 504 times could have been triggered
  ■ 85 times activated
16.87% of the time
Conclusions
● Each of these strategies improves performance by a small amount
● Together they allow practical speech recognition on a service robot
Thank you
● Questions?
Talk given at MICAI 2013, Mexico City: Practical Speech Recognition for Contextualized Service Robots.