Coverage of Spontaneous Conversational Speech from Nijmegen Corpus of Casual Czech by General ASR Language Models

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F11%3A00185972" target="_blank" >RIV/68407700:21230/11:00185972 - isvavai.cz</a>
Result on the web
<a href="http://mirjamernestus.ruhosting.nl/Ernestus/Workshop2011.php" target="_blank" >http://mirjamernestus.ruhosting.nl/Ernestus/Workshop2011.php</a>
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in the original language
Coverage of Spontaneous Conversational Speech from Nijmegen Corpus of Casual Czech by General ASR Language Models
Result description in the original language
Large Vocabulary Continuous Speech Recognition (LVCSR), one of the most common uses of speech technology, is being applied in a growing number of everyday applications. Consequently, the need to recognize spontaneous speech also arises; however, such speech differs strongly in character from non-spontaneous speech, and its specific phenomena are not expected to be covered by a standard general Language Model (LM). In this contribution we analyze the Nijmegen Corpus of Casual Czech (NCCCz) from the point of view of several publicly available LMs. We analyze the rate of Out-Of-Vocabulary (OOV) words; the rate of word fragments, repetitions, and repeated starts; the perplexity computed at the text level on the NCCCz transcriptions; and the LVCSR performance on the recordings using the above-mentioned LMs.
Title in English
Coverage of Spontaneous Conversational Speech from Nijmegen Corpus of Casual Czech by General ASR Language Models
Result description in English
Large Vocabulary Continuous Speech Recognition (LVCSR), one of the most common uses of speech technology, is being applied in a growing number of everyday applications. Consequently, the need to recognize spontaneous speech also arises; however, such speech differs strongly in character from non-spontaneous speech, and its specific phenomena are not expected to be covered by a standard general Language Model (LM). In this contribution we analyze the Nijmegen Corpus of Casual Czech (NCCCz) from the point of view of several publicly available LMs. We analyze the rate of Out-Of-Vocabulary (OOV) words; the rate of word fragments, repetitions, and repeated starts; the perplexity computed at the text level on the NCCCz transcriptions; and the LVCSR performance on the recordings using the above-mentioned LMs.

Classification

Type
O - Other results
CEP field
JA - Electronics and optoelectronics, electrical engineering
OECD FORD field
—

Result continuities

Project
<a href="/cs/project/GA102%2F08%2F0707" target="_blank" >GA102/08/0707: Speech recognition in real-world conditions</a>
Continuities
Z - Research plan (with a link to CEZ)

Others

Year of implementation
2011
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Similar results (10)

Language models for spontaneous speech recognition
Impact of Irregular Pronunciation on Phonetic Segmentation of Nijmegen Corpus of Casual Czech
TRAP-based features for LVCSR of meeting data