Fitting a Square Peg into a Round Hole: Creating a UniMorph dataset of Kanien'kéha Verbs
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AWISYF7HX" target="_blank" >RIV/00216208:11320/25:WISYF7HX - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85189622417&partnerID=40&md5=7d9f3cb5f01d9cd5c5353c5469d2e682" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85189622417&partnerID=40&md5=7d9f3cb5f01d9cd5c5353c5469d2e682</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Fitting a Square Peg into a Round Hole: Creating a UniMorph dataset of Kanien'kéha Verbs
Original language description
This paper describes efforts to annotate a dataset of verbs in the Iroquoian language Kanien'kéha (a.k.a. Mohawk) using the UniMorph schema (Batsuren et al., 2022a). The dataset is based on the output of a symbolic model - a hand-built verb conjugator. Morphological constituents of each verb are automatically annotated with UniMorph tags. Overall the process was smooth but some central features of the language did not fall neatly into the schema which resulted in a large number of custom tags and a somewhat ad hoc mapping process. We think the same difficulties are likely to arise for other Iroquoian languages and perhaps other North American language families. This paper describes our decision making process with respect to Kanien'kéha and reports preliminary results of morphological induction experiments using the dataset. © 2024 Association for Computational Linguistics.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
ComputEL - Workshop Use Comput. Methods Study Endanger. Lang., Proc. Workshop
ISBN
979-889176086-8
ISSN
—
e-ISSN
—
Number of pages
13
Pages from-to
39-51
Publisher name
Association for Computational Linguistics (ACL)
Place of publication
—
Event location
Hybrid, St. Julian'
Event date
Jan 1, 2025
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—