Accuracy of MP3 Speech Recognition Under Real-World Conditions. Experimental Study
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F11%3A00181583" target="_blank" >RIV/68407700:21230/11:00181583 - isvavai.cz</a>
Výsledek na webu
<a href="http://www.sigmap.icete.org" target="_blank" >http://www.sigmap.icete.org</a>
DOI - Digital Object Identifier
—
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Accuracy of MP3 Speech Recognition Under Real-World Conditions. Experimental Study
Popis výsledku v původním jazyce
This paper presents the study of speech recognition accuracy with respect to different levels of MP3 compression. Special attention is focused on the processing of speech signals with different quality, i.e. with different level of background noise and channel distortion. The work was motivated by possible usage of ASR for offline automatic transcription of audio recordings collected by standard wide-spread MP3 devices. The realized experiments have proved that although MP3 format does not distort speech significantly especially for high or moderate bit rates and high quality of source data. The accuracy of connected digits ASR decreased very slowly up to the bit rate 24 kbps. For the best case of PLP parameterization in close-talk channel just 3% decrease of recognition accuracy was observed while the size of the compressed file was approximately 10% of the original size. All results were slightly worse under presence of additive background noise and channel distortion.
Název v anglickém jazyce
Accuracy of MP3 Speech Recognition Under Real-World Conditions. Experimental Study
Popis výsledku anglicky
This paper presents the study of speech recognition accuracy with respect to different levels of MP3 compression. Special attention is focused on the processing of speech signals with different quality, i.e. with different level of background noise and channel distortion. The work was motivated by possible usage of ASR for offline automatic transcription of audio recordings collected by standard wide-spread MP3 devices. The realized experiments have proved that although MP3 format does not distort speech significantly especially for high or moderate bit rates and high quality of source data. The accuracy of connected digits ASR decreased very slowly up to the bit rate 24 kbps. For the best case of PLP parameterization in close-talk channel just 3% decrease of recognition accuracy was observed while the size of the compressed file was approximately 10% of the original size. All results were slightly worse under presence of additive background noise and channel distortion.
Klasifikace
Druh
D - Stať ve sborníku
CEP obor
JA - Elektronika a optoelektronika, elektrotechnika
OECD FORD obor
—
Návaznosti výsledku
Projekt
<a href="/cs/project/GA102%2F08%2F0707" target="_blank" >GA102/08/0707: Rozpoznávání mluvené řeči v reálných podmínkách</a><br>
Návaznosti
Z - Vyzkumny zamer (s odkazem do CEZ)
Ostatní
Rok uplatnění
2011
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název statě ve sborníku
Proceedings of SIGMAP 2011 - International Conference on Signal Processing and Multimedia Applications.
ISBN
978-989-8425-72-0
ISSN
—
e-ISSN
—
Počet stran výsledku
6
Strana od-do
5-10
Název nakladatele
University of Seville
Místo vydání
Sevilla
Místo konání akce
Sevilla
Datum konání akce
18. 7. 2011
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
—