Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F12%3A%230002003" target="_blank" >RIV/46747885:24220/12:#0002003 - isvavai.cz</a>
Výsledek na webu
—
DOI - Digital Object Identifier
—

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams
Popis výsledku v původním jazyce
In this paper we study the effect of incorporation of automatic transcriptions in the speaker diarization process. We aim to improve both the diarization accuracy as evaluated by standard objective measures and quality of the diarization output from user?s perspective. Although the presented approach relies on output of an automatic speech recognizer, it makes no use of lexical information. Instead, we use information about word boundaries and classification of non-speech events occurring in the processed stream. The former information is used as constraining condition for speaker change-point candidates and the latter facilitate to neglect various vocal noise sounds that carry no speaker-specific information (considering representation of the signal by cepstral features) and thus harm the speaker?s representation. The experimental evaluation of the presented approach was carried out using the COST278 multilingual broadcast news database. We demonstrate that the approach yields improve
Název v anglickém jazyce
Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams
Popis výsledku anglicky
In this paper we study the effect of incorporation of automatic transcriptions in the speaker diarization process. We aim to improve both the diarization accuracy as evaluated by standard objective measures and quality of the diarization output from user?s perspective. Although the presented approach relies on output of an automatic speech recognizer, it makes no use of lexical information. Instead, we use information about word boundaries and classification of non-speech events occurring in the processed stream. The former information is used as constraining condition for speaker change-point candidates and the latter facilitate to neglect various vocal noise sounds that carry no speaker-specific information (considering representation of the signal by cepstral features) and thus harm the speaker?s representation. The experimental evaluation of the presented approach was carried out using the COST278 multilingual broadcast news database. We demonstrate that the approach yields improve

Klasifikace

Druh
D - Stať ve sborníku
CEP obor
JC - Počítačový hardware a software
OECD FORD obor
—

Návaznosti výsledku

Projekt
<a href="/cs/project/TA01011204" target="_blank" >TA01011204: Živé archivy</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Ostatní

Rok uplatnění
2012
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název statě ve sborníku
Proc. of IEEE conf. on Multimedia Signal Processing (MMSP)
ISBN
978-1-4673-4572-9
ISSN
—
e-ISSN
—
Počet stran výsledku
6
Strana od-do
118-123
Název nakladatele
—
Místo vydání
Kanada
Místo konání akce
Banff, Kanada
Datum konání akce
1. 1. 2012
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
000312670200021

Podobné výsledky(10)

Speaker Diarization of Broadcast Audio Using Automatic Transcription, iVectors and Cosine Distance Scoring Online Speaker Diarization Study on Integration of Speaker Diarization with Speaker Adaptive Speech Recognition for Broadcast Transcription

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)