Data Conversion and Consistency of Monolingual Corpora: Russian UD Treebanks
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F18%3A10390144" target="_blank" >RIV/00216208:11320/18:10390144 - isvavai.cz</a>
Result on the web
<a href="http://www.ep.liu.se/ecp/155/ecp18155.pdf" target="_blank" >http://www.ep.liu.se/ecp/155/ecp18155.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Data Conversion and Consistency of Monolingual Corpora: Russian UD Treebanks
Original language description
In this paper we focus on syntactic annotation consistency within Universal Dependencies (UD) treebanks for Russian: UD_Russian-SynTagRus, UD_Russian-GSD, UD_Russian-Taiga, and UD_Russian-PUD. We describe the four treebanks, their distinctive features, and their development. To test and improve consistency within the treebanks, we revisited the experiments by Martínez Alonso and Zeman; our parsing experiments were conducted using a state-of-the-art parser that took part in the CoNLL 2017 Shared Task. We analyze error classes in functional and content relations and discuss a method to separate the errors induced by annotation inconsistency from those caused by syntactic complexity and other factors.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/GA15-10472S" target="_blank" >GA15-10472S: Morphologically and Syntactically Annotated Corpora of Many Languages</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2018
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 17th International Workshop on Treebanks and Linguistic Theories (TLT 2018)
ISBN
978-91-7685-137-1
ISSN
1650-3740
e-ISSN
not specified
Number of pages
14
Pages from-to
53-66
Publisher name
Linköping University Electronic Press
Place of publication
Linköping, Sweden
Event location
Oslo, Norway
Event date
Dec 13, 2018
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—