Latin Morphology through the Centuries: Ensuring Consistency for Better Language Processing
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/23:10475855 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A10475855)
Result on the web
https://aclanthology.org/2023.alp-1.7.pdf
DOI - Digital Object Identifier
10.5281/zenodo.8337364 (http://dx.doi.org/10.5281/zenodo.8337364)
Alternative languages
Result language
English
Original language name
Latin Morphology through the Centuries: Ensuring Consistency for Better Language Processing
Original language description
This paper focuses on the process of harmonising the five Latin treebanks available in Universal Dependencies with respect to morphological annotation. We propose a workflow that makes it possible first to spot inconsistencies and missing information, in order to detect to what extent the annotations differ, and then to correct the retrieved bugs, with the goal of equalising the annotation of morphological features in the treebanks and producing more consistent linguistic data. Subsequently, we present some experiments carried out with UDPipe and Stanza in order to assess the impact of such harmonisation on parsing accuracy.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
GX20-16819X: Language Understanding: from Syntax to Discourse
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2023
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the Ancient Language Processing Workshop
ISBN
978-954-452-087-8
ISSN
—
e-ISSN
—
Number of pages
9
Pages from-to
59-67
Publisher name
INCOMA
Place of publication
Varna, Bulgaria
Event location
Varna, Bulgaria
Event date
Sep 8, 2023
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—