Employment of Subspace Gaussian Mixture Models in Speaker Recognition

The result's identifiers

Result code in IS VaVaI
RIV/00216305:26230/15:PU117038 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F15%3APU117038)
Result on the web
https://ieeexplore.ieee.org/document/7178811
DOI - Digital Object Identifier
10.1109/ICASSP.2015.7178811 (http://dx.doi.org/10.1109/ICASSP.2015.7178811)

Alternative languages

Result language
English
Original language name
Employment of Subspace Gaussian Mixture Models in Speaker Recognition
Original language description
This paper presents a Subspace Gaussian Mixture Model (SGMM) approach, employed as a probabilistic generative model to estimate speaker vector representations that are subsequently used in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows robust estimation of low-dimensional speaker vectors, which can then be exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs trained in an ASR manner using manual transcriptions. To test the robustness of the system, we evaluate the proposed approach against a state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3 sec-10 sec, 10 sec-30 sec, 30 sec-60 sec, and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on truncated 3 sec-10 sec and 10 sec-30 sec utterances, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector based speaker verification system.
Czech name
—
Czech description
—

Classification

Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Publication year
2015
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

Article name in the collection
Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing
ISBN
978-1-4673-6997-8
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
4445-4449
Publisher name
IEEE Signal Processing Society
Place of publication
South Brisbane, Queensland
Event location
Brisbane
Event date
Apr 19, 2015
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000427402904111

Similar results(10)

SdSV Challenge 2020: Large-Scale Evaluation of Short-duration Speaker Verification
End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA
End-to-end DNN based text-independent speaker recognition for long and short utterances
