Recurrent Neural Network Based Speaker Change Detection from Text Transcription Applied in Telephone Speaker Diarization System
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F18%3A43952594" target="_blank" >RIV/49777513:23520/18:43952594 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-030-00794-2_37" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-030-00794-2_37</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-00794-2_37" target="_blank" >10.1007/978-3-030-00794-2_37</a>
Alternative languages
Result language
English
Original language name
Recurrent Neural Network Based Speaker Change Detection from Text Transcription Applied in Telephone Speaker Diarization System
Original language description
In this paper, we propose a speaker change detection system based on lexical information from the transcribed speech. For this purpose, we applied a recurrent neural network to decide whether an utterance ends at the end of each spoken word. Our motivation is to use the transcription of the conversation as an additional feature for a speaker diarization system, refining the segmentation step to improve the accuracy of the whole diarization system. We compare the proposed speaker change detection system based on transcription (text) with our previous system based on information from the spectrogram (audio), and we combine these two modalities to improve the diarization results. We cut the conversation into segments according to the detected changes and represent each segment by an i-vector. We conducted experiments on the English part of the CallHome corpus. The results indicate a relative improvement in speaker change detection (by 0.5%) and in speaker diarization (by 1%) when both modalities are used.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/DG16P02B009" target="_blank" >DG16P02B009: Access to a Linguistically Structured Database of Enquiries from the Language Consulting Centre</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2018
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Text, Speech, and Dialogue: 21st International Conference, TSD 2018, Brno, Czech Republic, September 11-14, 2018, Proceedings
ISBN
978-3-030-00793-5
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
9
Pages from-to
342-350
Publisher name
Springer Nature Switzerland AG
Place of publication
Cham
Event location
Brno, Czech Republic
Event date
Sep 11, 2018
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—