Speech-to-text summarization using automatic phrase extraction from recognized text

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F16%3A00000468" target="_blank" >RIV/46747885:24220/16:00000468 - isvavai.cz</a>
Výsledek na webu
<a href="http://dx.doi.org/10.1007/978-3-319-45510-5_12" target="_blank" >http://dx.doi.org/10.1007/978-3-319-45510-5_12</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-45510-5_12" target="_blank" >10.1007/978-3-319-45510-5_12</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Speech-to-text summarization using automatic phrase extraction from recognized text
Popis výsledku v původním jazyce
This paper describes a summarization system that was developed in order to summarize news delivered orally. The system generates text summaries from input audio using three independent components: an automatic speech recognizer, a syntactic analyzer, and a summarizer. The absence of sentence boundaries in the recognized text complicates the summarization process. Therefore, we use a syntactic analyzer to identify continuous segments in the recognized text.We used 50 reference articles to perform our evaluation. The data are publicly available at http://nlp.ite.tul.cz/sumarizace. The results of the proposed system were compared with the results of sentence summarization in the reference articles. The evaluation was performed using co-occurrence of n-grams in the reference and generated summaries, and by readers mark-ups. The readers marked two aspects of the summaries: readability and information relevance. Experiments confirm that the generated summaries have the same information value as the reference summaries. However, readers state that phrase summaries are hard to read without the whole sentence context.
Název v anglickém jazyce
Speech-to-text summarization using automatic phrase extraction from recognized text
Popis výsledku anglicky
This paper describes a summarization system that was developed in order to summarize news delivered orally. The system generates text summaries from input audio using three independent components: an automatic speech recognizer, a syntactic analyzer, and a summarizer. The absence of sentence boundaries in the recognized text complicates the summarization process. Therefore, we use a syntactic analyzer to identify continuous segments in the recognized text.We used 50 reference articles to perform our evaluation. The data are publicly available at http://nlp.ite.tul.cz/sumarizace. The results of the proposed system were compared with the results of sentence summarization in the reference articles. The evaluation was performed using co-occurrence of n-grams in the reference and generated summaries, and by readers mark-ups. The readers marked two aspects of the summaries: readability and information relevance. Experiments confirm that the generated summaries have the same information value as the reference summaries. However, readers state that phrase summaries are hard to read without the whole sentence context.

Klasifikace

Druh
D - Stať ve sborníku
CEP obor
JC - Počítačový hardware a software
OECD FORD obor
—

Návaznosti výsledku

Projekt
<a href="/cs/project/TA04010199" target="_blank" >TA04010199: MULTILINMEDIA - Multilinguální platforma pro monitoring a analýzu multimédií</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>S - Specificky vyzkum na vysokych skolach

Ostatní

Rok uplatnění
2016
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název statě ve sborníku
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISBN
978-3-319-45509-9
ISSN
0302-9743
e-ISSN
—
Počet stran výsledku
8
Strana od-do
101-108
Název nakladatele
Springer International Publishing
Místo vydání
Switzerland
Místo konání akce
Brno, Česká Republika
Datum konání akce
1. 1. 2016
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
000389707400012

Podobné výsledky(10)

Analiza besedil pisatelja Karla Čapka z vidika jezikoslovja Several Aspects of Machine-Driven Phrasing in Text-to-Speech Systems Evaluation measures for text summarization

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Speech-to-text summarization using automatic phrase extraction from recognized text

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)