Text punctuation: An inter-annotator agreement study

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F17%3A00004832" target="_blank" >RIV/46747885:24220/17:00004832 - isvavai.cz</a>
Alternative codes found
RIV/00216224:14330/17:00095096
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-64206-2_14" target="_blank" >http://dx.doi.org/10.1007/978-3-319-64206-2_14</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-64206-2_14" target="_blank" >10.1007/978-3-319-64206-2_14</a>

Alternative languages

Result language
English
Title in the original language
Text punctuation: An inter-annotator agreement study
Result description in the original language
Spoken language is a phenomenon that is hard to annotate accurately. One of the most ambiguous tasks is inserting punctuation marks into a spoken-language transcription. The punctuation marks used often depend on how annotators understand the content of the transcription. This understanding may differ because spoken language often lacks the clear structure inherent to written language, owing to the spontaneity of utterances or to skipping between ideas. We therefore suspect that inserting commas into spoken-language transcriptions is a very ambiguous task with low inter-annotator agreement (IAA). Low IAA also means that using Gold Truth (GT) annotations to evaluate automatic algorithms is questionable, as already discussed in [7, 8]. In this paper we analyze the IAA within a group of annotators and propose methods to increase it. We also propose and evaluate a reformulation of classical GT annotations for cases where multiple annotations are available.
Title in English
Text punctuation: An inter-annotator agreement study
Result description in English
Spoken language is a phenomenon that is hard to annotate accurately. One of the most ambiguous tasks is inserting punctuation marks into a spoken-language transcription. The punctuation marks used often depend on how annotators understand the content of the transcription. This understanding may differ because spoken language often lacks the clear structure inherent to written language, owing to the spontaneity of utterances or to skipping between ideas. We therefore suspect that inserting commas into spoken-language transcriptions is a very ambiguous task with low inter-annotator agreement (IAA). Low IAA also means that using Gold Truth (GT) annotations to evaluate automatic algorithms is questionable, as already discussed in [7, 8]. In this paper we analyze the IAA within a group of annotators and propose methods to increase it. We also propose and evaluate a reformulation of classical GT annotations for cases where multiple annotations are available.

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
20204 - Robotics and automatic control

Result continuities

Project
The result was created during the realization of more than one project. More information is available in the Projects tab.
Continuities
I - Institutional support for the long-term conceptual development of a research organization

Others

Publication year
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the article in the proceedings
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 20th International Conference on Text, Speech and Dialogue, TSD 2017
ISBN
9783319642055
ISSN
0302-9743
e-ISSN
—
Number of pages
9
Pages from-to
120-128
Publisher name
Springer Verlag
Place of publication
Germany
Event location
Prague, Czech Republic
Event date
1. 1. 2017
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—

Similar results (10)

Word confusions in the transcription and recognition of spontaneous Czech
Transformer-Based Automatic Punctuation Prediction and Word Casing Reconstruction of the ASR Output
Inter-Annotator Agreement on Spontaneous Czech Language
