Neural Network Speaker Descriptor in Speaker Diarization of Telephone Speech
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F17%3A43932647" target="_blank" >RIV/49777513:23520/17:43932647 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007%2F978-3-319-66429-3_55" target="_blank" >https://link.springer.com/chapter/10.1007%2F978-3-319-66429-3_55</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-66429-3_55" target="_blank" >10.1007/978-3-319-66429-3_55</a>
Alternative languages
Result language
English
Original language name
Neural Network Speaker Descriptor in Speaker Diarization of Telephone Speech
Original language description
In this paper, we investigate a speaker representation for a diarization system that groups short telephone conversation segments produced by the same speaker. The proposed approach uses a neural-network-based descriptor in place of the i-vector descriptor commonly used in state-of-the-art diarization systems. The two techniques were compared on the English part of the CallHome corpus. The final results indicate that the i-vector approach is superior, although the proposed descriptor contributes additional information. Consequently, a descriptor combining the two represents the speaker in a segment with a lower diarization error (almost 20% relative improvement over using the i-vector alone).
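The "relative improvement" quoted in the description is a ratio against the baseline error, not a difference in percentage points. A minimal sketch of that arithmetic follows; the function name and the example numbers are hypothetical and are not results taken from the paper.

```python
def relative_improvement(baseline_der: float, new_der: float) -> float:
    """Return the relative reduction in diarization error rate (DER).

    Both arguments use the same unit (for example, percent). The result
    is the fraction of the baseline error that was removed.
    """
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der


if __name__ == "__main__":
    # Hypothetical values for illustration only, not figures from the paper.
    ivector_only_der = 10.0  # DER (%) of an assumed i-vector-only system
    combined_der = 8.0       # DER (%) of an assumed combined-descriptor system
    gain = relative_improvement(ivector_only_der, combined_der)
    print(f"Relative improvement: {gain:.0%}")
```

With these assumed values, an absolute drop of 2 percentage points corresponds to a 20% relative improvement, which is the kind of figure the description reports.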
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/DG16P02B048" target="_blank" >DG16P02B048: System for permanent preservation of documentation and presentation of historical sources from the period of totalitarian regimes</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Speech and Computer: 19th International Conference, SPECOM 2017, Hatfield, UK, September 12-16, 2017, Proceedings
ISBN
978-3-319-66428-6
ISSN
0302-9743
e-ISSN
not stated
Number of pages
9
Pages from-to
555-563
Publisher name
Springer
Place of publication
Cham
Event location
Hatfield, Hertfordshire, United Kingdom
Event date
Sep 12, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—