Utilizing Lipreading in Large Vocabulary Continuous Speech Recognition
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F17%3A00004828" target="_blank" >RIV/46747885:24220/17:00004828 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-66429-3_77" target="_blank" >http://dx.doi.org/10.1007/978-3-319-66429-3_77</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-66429-3_77" target="_blank" >10.1007/978-3-319-66429-3_77</a>
Alternative languages
Result language
English
Title in the original language
Utilizing Lipreading in Large Vocabulary Continuous Speech Recognition
Result description in the original language
The vast majority of current research on audiovisual speech recognition via lipreading from frontal face videos focuses on simple cases such as isolated phrase recognition or structured speech, where the vocabulary is limited to several tens of units. In this paper, we diverge from these traditional applications and investigate the effect of incorporating visual information into the task of continuous speech recognition with vocabulary sizes ranging from several hundred to half a million words. To this end, we evaluate various visual speech parametrizations, both existing and novel, that are designed to capture different kinds of information in the video signal. The experiments are conducted on a moderately sized dataset of 54 speakers, each uttering 100 sentences in Czech. We show that even for large vocabularies the visual signal contains enough information to improve word accuracy by up to 15% relative to acoustic-only recognition.
Title in English
Utilizing Lipreading in Large Vocabulary Continuous Speech Recognition
Result description in English
The vast majority of current research on audiovisual speech recognition via lipreading from frontal face videos focuses on simple cases such as isolated phrase recognition or structured speech, where the vocabulary is limited to several tens of units. In this paper, we diverge from these traditional applications and investigate the effect of incorporating visual information into the task of continuous speech recognition with vocabulary sizes ranging from several hundred to half a million words. To this end, we evaluate various visual speech parametrizations, both existing and novel, that are designed to capture different kinds of information in the video signal. The experiments are conducted on a moderately sized dataset of 54 speakers, each uttering 100 sentences in Czech. We show that even for large vocabularies the visual signal contains enough information to improve word accuracy by up to 15% relative to acoustic-only recognition.
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
20204 - Robotics and automatic control
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organization
Others
Year of implementation
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 19th International Conference on Speech and Computer, SPECOM 2017
ISBN
9783319664286
ISSN
0302-9743
e-ISSN
—
Number of pages
10
Pages from-to
767-776
Publisher name
Springer Verlag
Place of publication
Federal Republic of Germany
Event location
Hatfield; United Kingdom
Event date
1. 1. 2017
Event type by nationality
WRD - Worldwide event
Article UT WoS code
—