Speech-to-text summarization using automatic phrase extraction from recognized text
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F16%3A00000468" target="_blank" >RIV/46747885:24220/16:00000468 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-45510-5_12" target="_blank" >http://dx.doi.org/10.1007/978-3-319-45510-5_12</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-45510-5_12" target="_blank" >10.1007/978-3-319-45510-5_12</a>
Alternative languages
Result language
English
Title in original language
Speech-to-text summarization using automatic phrase extraction from recognized text
Result description in original language
This paper describes a summarization system developed to summarize news delivered orally. The system generates text summaries from input audio using three independent components: an automatic speech recognizer, a syntactic analyzer, and a summarizer. The absence of sentence boundaries in the recognized text complicates the summarization process. Therefore, we use a syntactic analyzer to identify continuous segments in the recognized text. We used 50 reference articles to perform our evaluation. The data are publicly available at http://nlp.ite.tul.cz/sumarizace. The results of the proposed system were compared with the results of sentence summarization in the reference articles. The evaluation was performed using co-occurrence of n-grams in the reference and generated summaries, and by readers' mark-ups. The readers marked two aspects of the summaries: readability and information relevance. Experiments confirm that the generated summaries have the same information value as the reference summaries. However, readers state that phrase summaries are hard to read without the whole sentence context.
Title in English
Speech-to-text summarization using automatic phrase extraction from recognized text
Result description in English
This paper describes a summarization system developed to summarize news delivered orally. The system generates text summaries from input audio using three independent components: an automatic speech recognizer, a syntactic analyzer, and a summarizer. The absence of sentence boundaries in the recognized text complicates the summarization process. Therefore, we use a syntactic analyzer to identify continuous segments in the recognized text. We used 50 reference articles to perform our evaluation. The data are publicly available at http://nlp.ite.tul.cz/sumarizace. The results of the proposed system were compared with the results of sentence summarization in the reference articles. The evaluation was performed using co-occurrence of n-grams in the reference and generated summaries, and by readers' mark-ups. The readers marked two aspects of the summaries: readability and information relevance. Experiments confirm that the generated summaries have the same information value as the reference summaries. However, readers state that phrase summaries are hard to read without the whole sentence context.
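Illustrative note: the description mentions evaluation by co-occurrence of n-grams in the reference and generated summaries. The record does not state the exact metric, so the following is only a minimal sketch of one common variant (recall-oriented n-gram overlap with clipped counts). The function names `ngrams` and `ngram_recall`, the whitespace tokenization, and the lowercasing are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(reference, candidate, n=1):
    """Share of reference n-grams that also occur in the candidate.

    Counts are clipped, so an n-gram repeated in the candidate is credited
    at most as often as it appears in the reference.
    Tokenization is naive whitespace splitting (an assumption).
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference too short to contain any n-gram
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    generated = "the cat lay on the mat"
    print(ngram_recall(reference, generated, n=1))
    print(ngram_recall(reference, generated, n=2))
```

A higher value means the generated summary covers more of the reference summary's word sequences; it says nothing about readability, which the record reports was judged separately by readers.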
Classification
Type
D - Article in proceedings
CEP field
JC - Computer hardware and software
OECD FORD field
—
Result continuities
Project
<a href="/cs/project/TA04010199" target="_blank" >TA04010199: MULTILINMEDIA - Multilingual platform for monitoring and analysis of multimedia</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific university research
Others
Publication year
2016
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISBN
978-3-319-45509-9
ISSN
0302-9743
e-ISSN
—
Number of pages
8
Pages from-to
101-108
Publisher name
Springer International Publishing
Place of publication
Switzerland
Event location
Brno, Czech Republic
Event date
1. 1. 2016
Event type by nationality
WRD - Worldwide event
UT WoS article code
000389707400012