Study on the use and adaptation of bottleneck features for robust speech recognition of nonlinearly distorted speech

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F16%3A00000301" target="_blank" >RIV/46747885:24220/16:00000301 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.5220/0005955500650071" target="_blank" >http://dx.doi.org/10.5220/0005955500650071</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.5220/0005955500650071" target="_blank" >10.5220/0005955500650071</a>

Alternative languages

Result language
English
Title in original language
Study on the use and adaptation of bottleneck features for robust speech recognition of nonlinearly distorted speech
Result description in original language
This paper focuses on the robust recognition of nonlinearly distorted speech. We have previously reported that hybrid acoustic models based on a combination of Hidden Markov Models and Deep Neural Networks (HMM-DNNs) are better suited to this task than conventional HMMs utilizing Gaussian Mixture Models (HMM-GMMs). To further improve recognition accuracy, this paper investigates the possibility of combining the modeling power of deep neural networks with the adaptation to given acoustic conditions. For this purpose, the deep neural networks are utilized to produce bottleneck coefficients / features (BNC). The BNCs are subsequently used for training of HMM-GMM based acoustic models and then adapted using Constrained Maximum Likelihood Linear Regression (CMLLR). Our results obtained for three types of nonlinear distortions and three types of input features show that the adapted BNC-based system (a) outperforms HMM-DNN acoustic models in the case of strong compression and (b) yields comparable performance for speech affected by nonlinear amplification in the analog domain.
Title in English
Study on the use and adaptation of bottleneck features for robust speech recognition of nonlinearly distorted speech
Result description in English
This paper focuses on the robust recognition of nonlinearly distorted speech. We have previously reported that hybrid acoustic models based on a combination of Hidden Markov Models and Deep Neural Networks (HMM-DNNs) are better suited to this task than conventional HMMs utilizing Gaussian Mixture Models (HMM-GMMs). To further improve recognition accuracy, this paper investigates the possibility of combining the modeling power of deep neural networks with the adaptation to given acoustic conditions. For this purpose, the deep neural networks are utilized to produce bottleneck coefficients / features (BNC). The BNCs are subsequently used for training of HMM-GMM based acoustic models and then adapted using Constrained Maximum Likelihood Linear Regression (CMLLR). Our results obtained for three types of nonlinear distortions and three types of input features show that the adapted BNC-based system (a) outperforms HMM-DNN acoustic models in the case of strong compression and (b) yields comparable performance for speech affected by nonlinear amplification in the analog domain.

Classification

Type
D - Article in proceedings
CEP field
JC - Computer hardware and software
OECD FORD field
—

Result continuities

Project
<a href="/cs/project/TA04010199" target="_blank" >TA04010199: MULTILINMEDIA - Multilingual platform for monitoring and analysis of multimedia</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities

Others

Year of implementation
2016
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the article in the proceedings
Proc. of 13th International Conference on Signal Processing and Multimedia Applications (SIGMAP 2016)
ISBN
978-989-758-196-0
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
65-71
Publisher name
SciTePress
Place of publication
Lisbon, Portugal
Event location
Lisbon, Portugal
Event date
1 January 2016
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
000391091400006
