On Complementarity of State-of-the-art Speaker Recognition Systems
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F12%3A43916022" target="_blank" >RIV/49777513:23520/12:43916022 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
On Complementarity of State-of-the-art Speaker Recognition Systems
Result description in original language
In this paper, recent methods used in the task of Speaker Recognition (SR) are reviewed and their complementarity is analysed. First, methods based on Supervectors (SVs) related to Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs) used as a discriminative model are described, along with Nuisance Attribute Projection (NAP). NAP was proposed to suppress the undesirable influence of high channel variability between several sessions of a speaker. Next, recent methods focusing on the extraction of so-called i-vectors (low-dimensional representations of GMM-based SVs) are discussed. The space in which i-vectors lie is denoted the Total Variability Space (TVS), since it contains both between-speaker and session/channel variabilities. Once i-vectors have been extracted, a Probabilistic Linear Discriminant Analysis (PLDA) model is trained in the TVS. In the training phase of PLDA, the TVS is decomposed into a channel and a speaker subspace, hence each i-vector is supposed to be c
Title in English
On Complementarity of State-of-the-art Speaker Recognition Systems
Result description in English
In this paper, recent methods used in the task of Speaker Recognition (SR) are reviewed and their complementarity is analysed. First, methods based on Supervectors (SVs) related to Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs) used as a discriminative model are described, along with Nuisance Attribute Projection (NAP). NAP was proposed to suppress the undesirable influence of high channel variability between several sessions of a speaker. Next, recent methods focusing on the extraction of so-called i-vectors (low-dimensional representations of GMM-based SVs) are discussed. The space in which i-vectors lie is denoted the Total Variability Space (TVS), since it contains both between-speaker and session/channel variabilities. Once i-vectors have been extracted, a Probabilistic Linear Discriminant Analysis (PLDA) model is trained in the TVS. In the training phase of PLDA, the TVS is decomposed into a channel and a speaker subspace, hence each i-vector is supposed to be c
Classification
Type
D - Article in proceedings
CEP field
JD - Use of computers, robotics and its applications
OECD FORD field
—
Result linkages
Project
<a href="/cs/project/GBP103%2F12%2FG084" target="_blank" >GBP103/12/G084: Center for Large Scale Multi-modal Data Interpretation</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Publication year
2012
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
IEEE International Symposium on Signal Processing and Information Technology
ISBN
978-1-4673-5604-6
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
1-6
Publisher name
Institute of Electrical and Electronics Engineers (IEEE)
Place of publication
Not specified
Event location
Vietnam, Ho Chi Minh City
Event date
12. 12. 2012
Type of event by nationality
WRD - Worldwide event
UT code of the WoS article
—