Study on the use and adaptation of bottleneck features for robust speech recognition of nonlinearly distorted speech
Result identifiers
Result code in IS VaVaI
RIV/46747885:24220/16:00000301 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F16%3A00000301)
Result on the web
http://dx.doi.org/10.5220/0005955500650071
DOI - Digital Object Identifier
10.5220/0005955500650071 (http://dx.doi.org/10.5220/0005955500650071)
Alternative languages
Result language
English
Title in the original language
Study on the use and adaptation of bottleneck features for robust speech recognition of nonlinearly distorted speech
Result description in the original language
This paper focuses on the robust recognition of nonlinearly distorted speech. We have previously reported that hybrid acoustic models based on a combination of Hidden Markov Models and Deep Neural Networks (HMM-DNNs) are better suited to this task than conventional HMMs utilizing Gaussian Mixture Models (HMM-GMMs). To further improve recognition accuracy, this paper investigates the possibility of combining the modeling power of deep neural networks with the adaptation to given acoustic conditions. For this purpose, the deep neural networks are utilized to produce bottleneck coefficients / features (BNC). The BNCs are subsequently used for training of HMM-GMM based acoustic models and then adapted using Constrained Maximum Likelihood Linear Regression (CMLLR). Our results obtained for three types of nonlinear distortions and three types of input features show that the adapted BNC-based system (a) outperforms HMM-DNN acoustic models in the case of strong compression and (b) yields comparable performance for speech affected by nonlinear amplification in the analog domain.
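The pipeline described above (a DNN produces bottleneck features, which are then adapted with CMLLR) can be illustrated with a minimal sketch. Everything below is hypothetical and not taken from the paper: the layer sizes, the random weights, the choice to read the bottleneck before its nonlinearity, and the helper names `bottleneck_features` and `apply_cmllr`. HMM-GMM training and the maximum-likelihood estimation of the CMLLR transform are omitted; the sketch only shows that CMLLR acts on features as an affine map x' = A x + b.

```python
import math
import random

def dense(x, weights, bias):
    """Affine layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-a)) for a in values]

def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights (assumption: a real DNN would be trained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def bottleneck_features(frame, layers, bottleneck_index):
    """Forward one frame and return the linear output of the bottleneck layer."""
    h = frame
    for i, (weights, bias) in enumerate(layers):
        z = dense(h, weights, bias)
        if i == bottleneck_index:
            return z  # assumption: features are read before the nonlinearity
        h = sigmoid(z)
    raise ValueError("bottleneck_index is outside the network")

def apply_cmllr(feature, A, b):
    """Feature-space affine transform x' = A x + b (A, b assumed already estimated)."""
    return dense(feature, A, b)

rng = random.Random(0)
sizes = [40, 64, 8, 64]  # hypothetical: 40-dim input, 8-dim bottleneck
layers = [random_layer(sizes[i], sizes[i + 1], rng)
          for i in range(len(sizes) - 1)]

frame = [rng.gauss(0.0, 1.0) for _ in range(sizes[0])]
bnc = bottleneck_features(frame, layers, bottleneck_index=1)

# Identity transform as a stand-in; a real system estimates A and b per condition.
identity = [[1.0 if r == c else 0.0 for c in range(8)] for r in range(8)]
adapted = apply_cmllr(bnc, identity, [0.0] * 8)
```

In an actual system the adapted features would be scored by the HMM-GMM acoustic model, and A and b would be estimated to maximize the likelihood of adaptation data from the target acoustic condition.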
Classification
Type
D - Article in proceedings
CEP field
JC - Computer hardware and software
OECD FORD field
—
Result linkages
Project
TA04010199: MULTILINMEDIA - Multilingual platform for monitoring and analysis of multimedia (/cs/project/TA04010199)
Linkages
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities
Other
Year of implementation
2016
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
Proc. of 13th International Conference on Signal Processing and Multimedia Applications (SIGMAP 2016)
ISBN
978-989-758-196-0
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
65-71
Publisher name
SciTePress
Place of publication
Lisbon, Portugal
Event location
Lisbon, Portugal
Event date
1. 1. 2016
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
000391091400006