Text Structure and Its Ambiguities: Corpus Annotation as a Helpful Guide
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/24:10492920 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F24%3A10492920)
Result on the web
https://ceur-ws.org/Vol-3792/invited2.pdf
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Text Structure and Its Ambiguities: Corpus Annotation as a Helpful Guide
Result description in original language
It is typical for natural languages that their texts can be understood differently by individual recipients. A number of scientific disciplines, from cognitive psychology to linguistics, are devoted to this phenomenon. In this study, we focus mainly on linguistic factors, which may lead to different interpretations of coherence relations in the text (simply speaking, what is related to what and how). This work presents a pilot typological survey of disagreements in Czech corpus annotations of coherence relations (discourse relations, coreference, information structure) and their common features. Polysemy (polyfunctionality) and semantic underspecification of coherent expressions (e.g. discourse connectives), generic / abstract meaning of autosemantic words, presence of attribution constructions, word order as a potential marker of information structure and text size appear to be essential factors for disagreement in interpretation. In addition, subjective reception of the relative importance of differ
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
GA24-11132S: Neshoda v korpusové anotaci ve vztahu k víceznačnosti textu (Disagreement in corpus annotation in relation to text ambiguity)
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of publication
2024
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the proceedings
Proceedings of the 24th Conference Information Technologies – Applications and Theory (ITAT 2024)
ISBN
—
ISSN
1613-0073
e-ISSN
—
Number of pages
11
Pages from-to
2-12
Publisher name
CEUR-WS.org
Place of publication
Košice, Slovakia
Event location
Drienica, Slovakia
Event date
20 September 2024
Event type by nationality
CST - National event
UT WoS code of the article
—