Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F17%3A00004815" target="_blank" >RIV/46747885:24220/17:00004815 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-64206-2_20" target="_blank" >http://dx.doi.org/10.1007/978-3-319-64206-2_20</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-64206-2_20" target="_blank" >10.1007/978-3-319-64206-2_20</a>
Alternative languages
Result language
English
Original language name
Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems
Original language description
In this paper, we present a fully automated process for creating the speech databases needed to train acoustic models for speech recognition systems. We show that archives of national parliaments are perfect sources of speech and text data suited to a lightly supervised training scheme, which does not require human intervention. We describe the process and its procedures in detail and demonstrate its use on three Slavic languages (Polish, Russian and Bulgarian). Practical evaluation is done on a broadcast news task and yields better results than those obtained on some established speech databases.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20204 - Robotics and automatic control
Result continuities
Project
<a href="/en/project/TA04010199" target="_blank" >TA04010199: MULTILINMEDIA - Multilingual Multimedia Monitoring and Analyzing Platform</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2017
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISBN
9783319642055
ISSN
0302-9743
e-ISSN
—
Number of pages
9
Pages from-to
174-182
Publisher name
Springer Verlag
Place of publication
Germany
Event location
Prague, Czech Republic
Event date
Jan 1, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—