Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F22%3APU145980" target="_blank" >RIV/00216305:26230/22:PU145980 - isvavai.cz</a>
Result on the web
<a href="https://www.isca-speech.org/archive/pdfs/interspeech_2022/brummer22_interspeech.pdf" target="_blank" >https://www.isca-speech.org/archive/pdfs/interspeech_2022/brummer22_interspeech.pdf</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.21437/Interspeech.2022-731" target="_blank" >10.21437/Interspeech.2022-731</a>
Alternative languages
Result language
English
Title in original language
Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings
Result description in original language
In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring backends are commonly used, namely cosine scoring and PLDA. Both have advantages and disadvantages, depending on the context. Cosine scoring follows naturally from the spherical geometry, but for PLDA the blessing is mixed: length normalization Gaussianizes the between-speaker distribution, but violates the assumption of a speaker-independent within-speaker distribution. We propose PSDA, an analogue to PLDA that uses Von Mises-Fisher distributions on the hypersphere for both within-class and between-class distributions. We show how the self-conjugacy of this distribution gives closed-form likelihood-ratio scores, making it a drop-in replacement for PLDA at scoring time. All kinds of trials can be scored, including single-enroll and multi-enroll verification, as well as more complex likelihood ratios that could be used in clustering and diarization. Learning is done via an EM algorithm with closed-form updates. We explain the model and present some first experiments.
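The two scoring ideas in the description can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy setting with 3-dimensional embeddings, where the Von Mises-Fisher normalizer has the closed form C_3(kappa) = kappa / (4 pi sinh kappa); realistic high-dimensional embeddings would need modified Bessel functions instead. The function names and the parameters `mu` (between-speaker mean direction), `b` (between-speaker concentration) and `w` (within-speaker concentration) are illustrative assumptions. The likelihood ratio is obtained from conjugacy: the marginal likelihood of a set of embeddings X is C(b) C(w)^n / C(||b mu + w sum(X)||), so the same-speaker versus different-speaker ratio reduces to a ratio of normalizers.

```python
import math


def length_normalize(v):
    """Project a vector onto the unit hypersphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def vec_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def cosine_score(enroll, test):
    """Cosine scoring: dot product of the length-normalized embeddings."""
    e, t = length_normalize(enroll), length_normalize(test)
    return sum(a * b for a, b in zip(e, t))


def log_c3(kappa):
    """Log normalizer of a vMF density in 3-D (toy assumption):
    C_3(kappa) = kappa / (4*pi*sinh(kappa)), with limit 1/(4*pi) at kappa -> 0."""
    if kappa < 1e-8:
        return -math.log(4 * math.pi)
    # log sinh(k) = k + log(1 - exp(-2k)) - log 2, stable for large k
    log_sinh = kappa + math.log1p(-math.exp(-2 * kappa)) - math.log(2)
    return math.log(kappa) - math.log(4 * math.pi) - log_sinh


def psda_llr_3d(enroll_set, test, mu, b, w):
    """Log likelihood ratio (same speaker vs. different speakers) for a
    hypothetical 3-D model: z ~ vMF(mu, b), x | z ~ vMF(z, w)."""
    mu = length_normalize(mu)
    test = length_normalize(test)
    enroll_sum = [0.0, 0.0, 0.0]
    for x in enroll_set:
        for i, xi in enumerate(length_normalize(x)):
            enroll_sum[i] += xi
    # Natural parameters of the speaker posterior for each hypothesis
    nat_enroll = [b * m + w * s for m, s in zip(mu, enroll_sum)]
    nat_test = [b * m + w * t for m, t in zip(mu, test)]
    nat_both = [e + w * t for e, t in zip(nat_enroll, test)]
    return (log_c3(vec_norm(nat_enroll)) + log_c3(vec_norm(nat_test))
            - log_c3(b) - log_c3(vec_norm(nat_both)))


if __name__ == "__main__":
    mu, b, w = [0.0, 0.0, 1.0], 0.5, 10.0  # illustrative values only
    enroll = [[1.0, 0.0, 0.0]]
    print(cosine_score(enroll[0], [1.0, 0.0, 0.0]))
    print(psda_llr_3d(enroll, [1.0, 0.0, 0.0], mu, b, w))   # same direction
    print(psda_llr_3d(enroll, [-1.0, 0.0, 0.0], mu, b, w))  # opposite direction
```

With these illustrative parameters, a test embedding aligned with the enrollment embedding gives a positive log likelihood ratio and an opposite one gives a negative ratio. Multi-enroll trials are handled by passing several embeddings in `enroll_set`, since only their sum enters the score.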
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
The result was created during the realization of multiple projects. More information is available in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ISBN
—
ISSN
1990-9772
e-ISSN
—
Number of pages
5
Pages from-to
1446-1450
Publisher name
International Speech Communication Association
Place of publication
Incheon
Event location
Incheon, Korea
Event date
18 September 2022
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—