Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F17%3A43932657" target="_blank" >RIV/49777513:23520/17:43932657 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.21437/Interspeech.2017-51" target="_blank" >http://dx.doi.org/10.21437/Interspeech.2017-51</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.21437/Interspeech.2017-51" target="_blank" >10.21437/Interspeech.2017-51</a>
Alternative languages
Result language
English
Title in original language
Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement
Result description in original language
The aim of this paper is to investigate the benefit of information from a speaker change detection system based on Convolutional Neural Network (CNN) when applied to the process of accumulation of statistics for an i-vector generation. The investigation is carried out on the problem of diarization. In our system, the output of the CNN is a probability value of a speaker change in a conversation for a given time segment. According to this probability, we cut the conversation into short segments that are then represented by the i-vector (to describe a speaker in it). We propose a technique to utilize the information from the CNN for the weighting of the acoustic data in a segment to refine the statistics accumulation process. This technique enables us to represent the speaker better in the final i-vector. The experiments on the English part of the CallHome corpus show that our proposed refinement of the statistics accumulation is beneficial with the relative improvement of Diarization Error Rate almost by 16 % when compared to the speaker diarization system without statistics refinement.
Title in English
Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement
Result description in English
The aim of this paper is to investigate the benefit of information from a speaker change detection system based on Convolutional Neural Network (CNN) when applied to the process of accumulation of statistics for an i-vector generation. The investigation is carried out on the problem of diarization. In our system, the output of the CNN is a probability value of a speaker change in a conversation for a given time segment. According to this probability, we cut the conversation into short segments that are then represented by the i-vector (to describe a speaker in it). We propose a technique to utilize the information from the CNN for the weighting of the acoustic data in a segment to refine the statistics accumulation process. This technique enables us to represent the speaker better in the final i-vector. The experiments on the English part of the CallHome corpus show that our proposed refinement of the statistics accumulation is beneficial with the relative improvement of Diarization Error Rate almost by 16 % when compared to the speaker diarization system without statistics refinement.
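The weighting idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not state the weighting function, so the per-frame weight `1 - p` (where `p` is a speaker-change probability) is an assumption, and the function and parameter names are hypothetical. The sketch accumulates the zeroth- and first-order statistics commonly used for i-vector extraction, scaling each frame's contribution by its weight.

```python
from typing import List, Sequence, Tuple


def accumulate_weighted_stats(
    features: Sequence[Sequence[float]],
    posteriors: Sequence[Sequence[float]],
    change_probs: Sequence[float],
) -> Tuple[List[float], List[List[float]]]:
    """Accumulate weighted zeroth- and first-order statistics for one segment.

    features[t]     -- acoustic feature vector of frame t (dimension D)
    posteriors[t]   -- background-model component posteriors of frame t (C values)
    change_probs[t] -- hypothetical per-frame speaker-change probability in [0, 1]
    """
    if not features:
        raise ValueError("segment must contain at least one frame")
    if not (len(features) == len(posteriors) == len(change_probs)):
        raise ValueError("all inputs must have one entry per frame")

    n_comp = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * n_comp                        # N_c = sum_t w_t * gamma_t(c)
    first = [[0.0] * dim for _ in range(n_comp)]   # F_c = sum_t w_t * gamma_t(c) * x_t

    for x, gamma, p in zip(features, posteriors, change_probs):
        # Assumed weighting: frames likely to lie near a speaker change
        # contribute less to the segment statistics.
        w = 1.0 - p
        for c in range(n_comp):
            g = w * gamma[c]
            zeroth[c] += g
            for d in range(dim):
                first[c][d] += g * x[d]
    return zeroth, first
```

With all change probabilities set to zero, the function reduces to ordinary unweighted accumulation, which corresponds to the baseline without statistics refinement.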
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
20205 - Automation and control systems
Result continuities
Project
<a href="/cs/project/LO1506" target="_blank" >LO1506: Sustainability support of the NTIS centre - New Technologies for the Information Society</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
Proceedings of the 18th Annual Conference of the International Speech Communication Association (Interspeech 2017)
ISBN
978-1-5108-4876-4
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
3562-3566
Publisher name
Curran Associates, Inc.
Place of publication
Red Hook, NY
Event location
Stockholm, Sweden
Event date
20 August 2017
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
000457505000741