13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F20%3APU135811" target="_blank" >RIV/00216305:26230/20:PU135811 - isvavai.cz</a>
Alternative codes found
RIV/00216305:26230/19:PU135811
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0885230819302797?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0885230819302797?via%3Dihub</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.csl.2019.101035" target="_blank" >10.1016/j.csl.2019.101035</a>
Alternative languages
Result language
English
Original language name
13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE
Original language description
In this paper, we present a brief history and a "longitudinal study" of all important milestone modelling techniques used in text-independent speaker recognition since Brno University of Technology (BUT) first participated in the NIST Speaker Recognition Evaluation (SRE) in 2006: GMM MAP, GMM MAP with eigen-channel adaptation, Joint Factor Analysis, i-vector and DNN embedding (x-vector). To emphasize the historical context, the techniques are evaluated on all NIST SRE sets since 2004 on a time-machine principle, i.e. a system is always trained using all data available up to the year of evaluation. Moreover, as user-contributed audiovisual content dominates today's Internet, we also include the Speakers In The Wild (SITW) and VOiCES challenge datasets as representative examples in the evaluation of our systems. We not only present a comparison of the modelling techniques but also show the effect of sampling frequency.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/VI20152020025" target="_blank" >VI20152020025: Information mining in speech acquired by distant microphones - DRAPÁK</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2020
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
COMPUTER SPEECH AND LANGUAGE
ISSN
0885-2308
e-ISSN
1095-8363
Volume of the periodical
2020
Issue of the periodical within the volume
63
Country of publishing house
GB - UNITED KINGDOM
Number of pages
15
Pages from-to
1-15
UT code for WoS article
000534481900003
EID of the result in the Scopus database
2-s2.0-85080857173