Morphological Tagging in Bribri Using Universal Dependency Features
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AQILZVPIU" target="_blank" >RIV/00216208:11320/25:QILZVPIU - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85216933656&doi=10.18653%2fv1%2f2024.americasnlp-1.8&partnerID=40&md5=b9f65fc65498406d8951b5a6404dc50e" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85216933656&doi=10.18653%2fv1%2f2024.americasnlp-1.8&partnerID=40&md5=b9f65fc65498406d8951b5a6404dc50e</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.18653/v1/2024.americasnlp-1.8" target="_blank" >10.18653/v1/2024.americasnlp-1.8</a>
Alternative languages
Result language
English
Title in original language
Morphological Tagging in Bribri Using Universal Dependency Features
Result description in original language
This paper outlines the Universal Features tagging of a dependency treebank for Bribri, an Indigenous language of Costa Rica. Universal Features are a morphosyntactic tagging component of Universal Dependencies, which is a framework that aims to provide an annotation system inclusive of all languages and their diverse structures (Nivre et al., 2016; de Marneffe et al., 2021). We used a rule-based system to do a first-pass tagging of a treebank of 1572 words. After manual corrections, the treebank contained 3051 morphological features. We then used this morphologically-tagged treebank to train a UDPipe 2 parsing and tagging model. This model has a UFEATS precision of 80.5 ± 3.6, which is a statistically significant improvement upon the previously available FOMA-based morphological tagger for Bribri. An error analysis suggests that missing TAM and case markers are the most common problem for the model. We hope to use this model to expand upon existing treebanks and facilitate the construction of linguistically-annotated corpora for the language. © 2024 Association for Computational Linguistics.
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
—
Other
Year of implementation
2024
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
AmericasNLP - Workshop Nat. Lang. Process. Indig. Lang. Am. - Proc. Workshop
ISBN
979-889176108-7
ISSN
—
e-ISSN
—
Number of pages
11
Pages from-to
56-66
Publisher name
Association for Computational Linguistics (ACL)
Place of publication
—
Event location
Mexico City
Event date
1. 1. 2025
Event type by nationality
WRD - Worldwide event
UT WoS article code
—