Text punctuation: An inter-annotator agreement study
The result's identifiers
Result code in IS VaVaI
RIV/46747885:24220/17:00004832 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F17%3A00004832)
Alternative codes found
RIV/00216224:14330/17:00095096
Result on the web
http://dx.doi.org/10.1007/978-3-319-64206-2_14
DOI - Digital Object Identifier
10.1007/978-3-319-64206-2_14
Alternative languages
Result language
English
Original language name
Text punctuation: An inter-annotator agreement study
Original language description
Spoken language is hard to annotate accurately. One of the most ambiguous tasks is inserting punctuation marks into a transcription of spoken language. The punctuation marks used often depend on how annotators understand the content of the transcription. This understanding may differ, because spoken language often lacks the clear structure inherent to written language, owing to the spontaneity of utterances or to skipping between ideas. We therefore suspect that inserting commas into a spoken language transcription is a very ambiguous task with low inter-annotator agreement (IAA). Low IAA also means that using Gold Truth (GT) annotations to evaluate automatic algorithms is questionable, as already discussed in [7, 8]. In this paper we analyze the IAA within a group of annotators and propose methods to increase it. We also propose and evaluate a reformulation of classical GT annotations for cases where multiple annotations are available.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20204 - Robotics and automatic control
Result continuities
Project
Result was created during the realization of more than one project.
Continuities
I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2017
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 20th International Conference on Text, Speech and Dialogue, TSD 2017
ISBN
9783319642055
ISSN
0302-9743
e-ISSN
—
Number of pages
9
Pages from-to
120-128
Publisher name
Springer Verlag
Place of publication
Germany
Event location
Prague, Czech Republic
Event date
Jan 1, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—