Unified Approach to Development of ASR Systems for East Slavic Languages
Result identifiers
Result code in IS VaVaI
RIV/46747885:24220/17:00004820 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F17%3A00004820)
Result on the web
http://dx.doi.org/10.1007/978-3-319-68456-7_16
DOI - Digital Object Identifier
10.1007/978-3-319-68456-7_16
Alternative languages
Result language
English
Title in the original language
Unified Approach to Development of ASR Systems for East Slavic Languages
Result description in the original language
This paper deals with the development of language-specific modules (lexicons, phonetic inventories, LMs and AMs) for Russian, Ukrainian and Belarusian (used by 260M, 45M and 3M native speakers, respectively). Instead of working on each language separately, we adopt a common approach that allows us to share data and tools while still taking language-specific features into account. We use only freely available text and audio data found on the web pages of major newspaper publishers and broadcasters. This must be done with great care, as the 3 languages are often mixed in spoken and written media; one component of the automated training process is therefore a language identification module. The complete process outputs 3 pronunciation lexicons (each of about 300K words), 3 partly shared phoneme sets, and the corresponding acoustic (DNN) and language (N-gram) models. We employ them in our media monitoring system and report results achieved on a test set made up of several complete TV news broadcasts in all 3 languages. The WER values range from 24% to 36%.
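The record does not say how the language identification module works. As a purely illustrative sketch of why written Russian, Ukrainian and Belarusian can be told apart at all, the following Python snippet counts letters that fall outside each language's standard alphabet and picks the language with the fewest such letters. The function name `guess_language`, the alphabet table and the tie handling are assumptions made for this example, not the authors' method, and a real module would also need to handle audio.

```python
# Hypothetical orthography-based heuristic; NOT the method used in the paper.
# Assumption: standard modern alphabets of the three languages (lowercase).
ALPHABETS = {
    "ru": set("абвгдеёжзийклмнопрстуфхцчшщъыьэюя"),
    "uk": set("абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"),
    "be": set("абвгдеёжзійклмнопрстуўфхцчшыьэюя"),
}


def guess_language(text):
    """Return 'ru', 'uk' or 'be', or None when the text is inconclusive."""
    # Keep only characters from the Unicode Cyrillic block (U+0400..U+04FF).
    letters = [ch for ch in text.lower() if "\u0400" <= ch <= "\u04ff"]
    if not letters:
        return None
    # Penalty = number of letters that do not belong to the language's alphabet.
    penalties = {
        lang: sum(ch not in alphabet for ch in letters)
        for lang, alphabet in ALPHABETS.items()
    }
    best = min(penalties.values())
    winners = [lang for lang, p in penalties.items() if p == best]
    # A tie means the text uses only letters shared by several languages.
    return winners[0] if len(winners) == 1 else None
```

Short texts often contain only shared letters, in which case the sketch returns `None` instead of guessing.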
Title in English
Unified Approach to Development of ASR Systems for East Slavic Languages
Result description in English
This paper deals with the development of language-specific modules (lexicons, phonetic inventories, LMs and AMs) for Russian, Ukrainian and Belarusian (used by 260M, 45M and 3M native speakers, respectively). Instead of working on each language separately, we adopt a common approach that allows us to share data and tools while still taking language-specific features into account. We use only freely available text and audio data found on the web pages of major newspaper publishers and broadcasters. This must be done with great care, as the 3 languages are often mixed in spoken and written media; one component of the automated training process is therefore a language identification module. The complete process outputs 3 pronunciation lexicons (each of about 300K words), 3 partly shared phoneme sets, and the corresponding acoustic (DNN) and language (N-gram) models. We employ them in our media monitoring system and report results achieved on a test set made up of several complete TV news broadcasts in all 3 languages. The WER values range from 24% to 36%.
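The description reports word error rate (WER) without defining it. As a reminder of the usual definition, WER is the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal Python version of that definition; the function name `word_error_rate` and the whitespace tokenization are assumptions for illustration, and the scoring tool actually used for the reported results is not stated in the record.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 24% means roughly one word error for every four reference words.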
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
20204 - Robotics and automatic control
Result linkages
Project
TA04010199: MULTILINMEDIA - Multilingual platform for monitoring and analysis of multimedia
Linkages
P - Research and development project financed from public funds (with a link to CEP)
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the collection
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISBN
9783319684550
ISSN
0302-9743
e-ISSN
—
Number of pages
11
Pages from-to
193-203
Publisher name
Springer Verlag
Place of publication
Germany
Event location
Le Mans, France
Event date
1 January 2017
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—