Language models for spontaneous speech recognition
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F15%3A00230233" target="_blank" >RIV/68407700:21230/15:00230233 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Language models for spontaneous speech recognition
Result description in the original language
The paper presents the creation of n-gram Language Models (LMs) for spontaneous speech recognition, with a special focus on recognition performed with data from the Nijmegen Corpus of Casual Czech (NCCCz). The required LMs, which cover spontaneous or casual speech respectively, were created using the data available in the collected NCCCz corpus. These models were combined with a general model built on the basis of the Czech National Corpus. The LMs were created and adapted to the specific thematic domain (casual speech) using the SRILM toolkit. The quality of the created LMs was measured in terms of perplexity and Out-Of-Vocabulary words at the text level; the models were also used within automatic speech recognition, and the achieved Word Error Rates are presented. The study shows that the created LMs were suitable for the given purpose and that the achieved results were comparable to those reported in other works on spontaneous speech recognition.
Classification
Type
D - Article in proceedings
CEP field
JA - Electronics and optoelectronics, electrical engineering
OECD FORD field
—
Result linkages
Project
—
Linkages
S - Specific research at universities
Other
Year of implementation
2015
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
Proceedings of the 19th International Scientific Student Conference POSTER 2015
ISBN
978-80-01-05499-4
ISSN
—
e-ISSN
—
Number of pages of the result
4
Pages from-to
1-4
Publisher name
Czech Technical University in Prague
Place of publication
Praha
Event location
Praha
Event date
14 May 2015
Event type by nationality
EUR - European event
UT WoS code of the article
—