Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F19%3APU132989" target="_blank" >RIV/00216305:26230/19:PU132989 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0885230818303607" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0885230818303607</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.csl.2019.06.004" target="_blank" >10.1016/j.csl.2019.06.004</a>
Alternative languages
Result language
English
Title in original language
Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition
Result description in original language
In this work, we present an analysis of a DNN-based autoencoder for speech enhancement, dereverberation and denoising. The target application is a robust speaker verification (SV) system. We start our approach by carefully designing a data augmentation process to cover a wide range of acoustic conditions and to obtain rich training data for various components of our SV system. We augment several well-known databases used in SV with artificially noised and reverberated data and we use them to train a denoising autoencoder (mapping noisy and reverberated speech to its clean version) as well as an x-vector extractor which is currently considered as state-of-the-art in SV. Later, we use the autoencoder as a preprocessing step for a text-independent SV system. We compare results achieved with autoencoder enhancement, multi-condition PLDA training and their simultaneous use. We present a detailed analysis with various conditions of NIST SRE 2010, 2016, PRISM and with re-transmitted data. We conclude that the proposed preprocessing can significantly improve both i-vector and x-vector baselines and that this technique can be used to build a robust SV system for various target domains.
Title in English
Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition
Result description in English
In this work, we present an analysis of a DNN-based autoencoder for speech enhancement, dereverberation and denoising. The target application is a robust speaker verification (SV) system. We start our approach by carefully designing a data augmentation process to cover a wide range of acoustic conditions and to obtain rich training data for various components of our SV system. We augment several well-known databases used in SV with artificially noised and reverberated data and we use them to train a denoising autoencoder (mapping noisy and reverberated speech to its clean version) as well as an x-vector extractor which is currently considered as state-of-the-art in SV. Later, we use the autoencoder as a preprocessing step for a text-independent SV system. We compare results achieved with autoencoder enhancement, multi-condition PLDA training and their simultaneous use. We present a detailed analysis with various conditions of NIST SRE 2010, 2016, PRISM and with re-transmitted data. We conclude that the proposed preprocessing can significantly improve both i-vector and x-vector baselines and that this technique can be used to build a robust SV system for various target domains.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Year of application
2019
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
COMPUTER SPEECH AND LANGUAGE
ISSN
0885-2308
e-ISSN
1095-8363
Periodical volume
2019
Periodical issue number within the volume
58
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages of the result
19
Pages from-to
403-421
UT WoS code of the article
000477663800022
EID of the result in the Scopus database
2-s2.0-85067550556