Study on Phrases Used for Semi-automatic Text-based Speakers' Names Extraction in the Czech Radio Broadcasts News

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F14%3A%230003003" target="_blank" >RIV/46747885:24220/14:#0003003 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-10816-2_50" target="_blank" >http://dx.doi.org/10.1007/978-3-319-10816-2_50</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-10816-2_50" target="_blank" >10.1007/978-3-319-10816-2_50</a>

Alternative languages

Result language
English
Title in the original language
Study on Phrases Used for Semi-automatic Text-based Speakers' Names Extraction in the Czech Radio Broadcasts News
Result description in the original language
In this paper we introduce a methodology for extending the speakers' database used in the automatic transcription of spoken documents stored in the largest Czech Radio audio archive. We address one issue in converting speech to written text: the automatic detection of speakers and their names. We work with a subset of the archive that consists of 8,020 hours of broadcast news containing 58,914,179 words from the years 1968-2011. Thousands of speakers' names occur in this period, so their identification must be automatic or semi-automatic. We also investigate another issue relevant to extending the speakers' database: the co-occurrence of a speaker's name within a specific phrase in the text transcription and a speaker change in the audio recording.
Title in English
Study on Phrases Used for Semi-automatic Text-based Speakers' Names Extraction in the Czech Radio Broadcasts News
Result description in English
In this paper we introduce a methodology for extending the speakers' database used in the automatic transcription of spoken documents stored in the largest Czech Radio audio archive. We address one issue in converting speech to written text: the automatic detection of speakers and their names. We work with a subset of the archive that consists of 8,020 hours of broadcast news containing 58,914,179 words from the years 1968-2011. Thousands of speakers' names occur in this period, so their identification must be automatic or semi-automatic. We also investigate another issue relevant to extending the speakers' database: the co-occurrence of a speaker's name within a specific phrase in the text transcription and a speaker change in the audio recording.

Classification

Type
D - Article in proceedings
CEP field
JC - Computer hardware and software
OECD FORD field
—

Result linkages

Project
<a href="/cs/project/DF11P01OVV013" target="_blank" >DF11P01OVV013: Making the Czech Radio archive accessible for sophisticated search</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2014
Data confidentiality code
S - Complete and truthful project data are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
Proc. of 17th International Conference, TSD 2014
ISBN
9783319108155
ISSN
0302-9743
e-ISSN
—
Number of pages
8
Pages from-to
416-423
Publisher name
Springer-Verlag Berlin Heidelberg
Place of publication
Berlin, Federal Republic of Germany
Event location
Brno, Czech Republic
Event date
1. 1. 2014
Event type by nationality
WRD - Worldwide event
UT WoS article code
—

Similar results (10)

Discretion of Speech Units for the Text Post-processing Phase of Automatic Transcription (in the Czech Language)
Post-processing of the Recognized Speech for Web Presentation of Large Audio Archive
Použití interpunkce v automatických přepisech mluveného slova (Use of punctuation in automatic transcriptions of the spoken word)
