End-to-end DNN based text-independent speaker recognition for long and short utterances
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F19%3APU132988" target="_blank" >RIV/00216305:26230/19:PU132988 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0885230818303632" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0885230818303632</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.csl.2019.06.002" target="_blank" >10.1016/j.csl.2019.06.002</a>
Alternative languages
Result language
English
Original language name
End-to-end DNN based text-independent speaker recognition for long and short utterances
Original language description
Recently, several end-to-end speaker verification systems based on deep neural networks (DNNs) have been proposed. These systems have proven competitive for text-dependent tasks as well as for text-independent tasks with short utterances. However, for text-independent tasks with longer utterances, end-to-end systems are still outperformed by standard i-vector + PLDA systems. In this work, we present an end-to-end speaker verification system that is initialized to mimic an i-vector + PLDA baseline. The system is then further trained in an end-to-end manner but regularized so that it does not deviate too far from the initial system. In this way, we mitigate overfitting, which normally limits the performance of end-to-end systems. The proposed system outperforms the i-vector + PLDA baseline on both long and short duration utterances.
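The description above does not say which regularizer is used, so the following is only an illustrative sketch of one common way to keep a model close to its initialization: a squared-L2 penalty on the distance between the current parameters and the initial ones. The function names, the flat parameter lists, and the `weight` and `lr` values are all hypothetical and are not taken from the article.

```python
def regularized_loss(task_loss, params, init_params, weight=0.1):
    """Task loss plus a squared-L2 penalty on the distance from the initial parameters.

    Assumption: parameters are flat lists of floats; `weight` is a hypothetical
    regularization strength, not a value from the article.
    """
    penalty = sum((p - p0) ** 2 for p, p0 in zip(params, init_params))
    return task_loss + weight * penalty


def sgd_step(params, task_grads, init_params, weight=0.1, lr=0.01):
    """One gradient-descent step on the regularized objective.

    The gradient of the penalty with respect to each parameter is
    2 * weight * (p - p0), which pulls the parameter back toward its initial value.
    """
    return [
        p - lr * (g + 2.0 * weight * (p - p0))
        for p, g, p0 in zip(params, task_grads, init_params)
    ]
```

With a zero task gradient, a step moves each parameter back toward its initial value, which is the sense in which such a penalty discourages the trained system from drifting far from the system it was initialized to mimic.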
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
COMPUTER SPEECH AND LANGUAGE
ISSN
0885-2308
e-ISSN
1095-8363
Volume of the periodical
2020
Issue of the periodical within the volume
59
Country of publishing house
GB - UNITED KINGDOM
Number of pages
14
Pages from-to
22-35
UT code for WoS article
000490540900002
EID of the result in the Scopus database
2-s2.0-85067618095