Automatic Segmentation of Speech into Sentence-like Units

Result identifiers

Result code in IS VaVaI
RIV/49777513:23520/08:00500610 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F08%3A00500610)
Result on the web
—
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in original language
Automatic Segmentation of Speech into Sentence-like Units
Result description in original language
This thesis deals with the problem of automatic segmentation of speech recognition output into sentence-like units. The work is focused on two languages: English and Czech. First, I describe the creation of two Czech speech corpora with structural metadata annotation in two different domains: broadcast news and broadcast conversations. The main goal of this work is to develop automatic systems for dialog act segmentation of English multiparty meetings and sentence unit segmentation of the two new Czech corpora. I use and compare three modeling approaches: hidden Markov models, maximum entropy, and a boosting-based algorithm called BoosTexter. All of these approaches rely on two information sources: recognized words and prosody. In addition, I explore speaker adaptation for this task. The results indicate that superior performance is achieved when the three statistical models are combined via posterior probability interpolation.
Title in English
Automatic Segmentation of Speech into Sentence-like Units
Result description in English
This thesis deals with the problem of automatic segmentation of speech recognition output into sentence-like units. The work is focused on two languages: English and Czech. First, I describe the creation of two Czech speech corpora with structural metadata annotation in two different domains: broadcast news and broadcast conversations. The main goal of this work is to develop automatic systems for dialog act segmentation of English multiparty meetings and sentence unit segmentation of the two new Czech corpora. I use and compare three modeling approaches: hidden Markov models, maximum entropy, and a boosting-based algorithm called BoosTexter. All of these approaches rely on two information sources: recognized words and prosody. In addition, I explore speaker adaptation for this task. The results indicate that superior performance is achieved when the three statistical models are combined via posterior probability interpolation.

Classification

Type
O - Other results
CEP field
JD - Use of computers, robotics and its application
OECD FORD field
—

Result continuities

Project
The result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Publication year
2008
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Similar results (10)

GENRE EFFECTS ON AUTOMATIC SENTENCE SEGMENTATION OF SPEECH: A COMPARISON OF BROADCAST NEWS AND BROADCAST CONVERSATIONS
Automatic sentence boundary detection in conversational speech: A cross-lingual evaluation on English and Czech
Structural metadata annotation in speech corpora: A comparison of broadcast news and broadcast discussions (Anotace strukturálních metadat v řečových korpusech: Srovnání rozhlasových zpráv a rozhlasových diskuzí)
