Investigation on Most Frequent Errors in Large-Scale Speech Recognition Applications
The result's identifiers
Result code in IS VaVaI
RIV/46747885:24220/12:#0002009 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F12%3A%230002009)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Investigation on Most Frequent Errors in Large-Scale Speech Recognition Applications
Original language description
When an automatic speech recognition (ASR) system is being developed for an application in which a large number of audio documents are to be transcribed, we need feedback that tells us what the main types of errors are, why and where they occur, and what can be done to eliminate them. While the algorithm commonly used for counting word errors is simple, it says little about the nature and source of the errors. In this paper, we introduce a scheme that offers more detailed insight into ASR errors. We apply it to the performance evaluation of a Czech ASR system whose main goal is to transcribe oral archives containing hundreds of thousands of spoken documents. The analysis is performed by comparing 763 hours of manually and automatically transcribed data. We list the main types of errors and present methods that try to eliminate at least the most relevant ones. We show that the proposed error locating method can be useful also when porting an exi
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
DF11P01OVV013: Disclosure of the Czech Radio archive for sophisticated search
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2012
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proc. of Text, Speech and Dialogue (TSD)
ISBN
978-3-642-32789-6
ISSN
—
e-ISSN
—
Number of pages
8
Pages from-to
520-527
Publisher name
Springer, Berlin Heidelberg
Place of publication
Berlin, Germany
Event location
Czech Republic
Event date
Jan 1, 2012
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—