LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION OF CZECH LECTURES
The result's identifiers
Result code in IS VaVaI
RIV/00216305:26230/08:PU78523 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F08%3APU78523)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION OF CZECH LECTURES
Original language description
This paper describes improvements in Automatic Speech Recognition (ASR) of Czech lectures obtained by enhancing language models. Our baseline is a statistical trigram language model with Good-Turing smoothing, trained on half a billion words from newspapers, books, etc. The overall improvement from adding more training data is over 10% absolute in accuracy, while using advanced language modeling techniques, mainly neural networks, yields another 3%. Perplexity improvements and out-of-vocabulary (OOV) reduction are also discussed.
Czech name
Jazykové modely pro rozpoznávání českých přednášek
Czech description
Článek je o jazykovém modelování. (The article is about language modeling.)
Classification
Type
D - Article in proceedings
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
GA102/08/0707: Speech Recognition under Real-World Conditions
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2008
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proc. STUDENT EEICT 2008
ISBN
978-80-214-3617-6
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
—
Publisher name
Faculty of Electrical Engineering and Communication BUT
Place of publication
Brno
Event location
FEKT VUT v Brně
Event date
Apr 24, 2008
Type of event by nationality
CST - Nationwide event
UT code for WoS article
—