Unsupervised Extraction of Morphological Categories for Morphemes
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F24%3A10492904" target="_blank" >RIV/00216208:11320/24:10492904 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-031-70563-2_19" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-031-70563-2_19</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-70563-2_19" target="_blank" >10.1007/978-3-031-70563-2_19</a>
Alternative languages
Result language
English
Original language name
Unsupervised Extraction of Morphological Categories for Morphemes
Original language description
Words in natural language can be assigned to specific morphological categories. For example, the English word 'apples' can be described using morphological labels like N;PL. The conditional probabilities of such word forms given the labels would reveal, for English, that the morpheme 's' is almost always present when the label N;PL appears. This indicates that the morphological properties of a word can be traced to its morphemes. We do not have any data resource that associates morphemes with morphological categories. We use the UniMorph schema and datasets for universal morphological annotation as a source of morphological categories and morpheme segmentation. We align morphemes (or exponents) with the corresponding morphological categories based on the UniMorph schema for 12 languages. Given the multilingual nature of the task, we utilize unsupervised methods based on the ΔP measure and IBM Models as we test out the effectiveness of alignment methods used in statistical machine translation. Our results in
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific research at universities<br>I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
27th International Conference on Text, Speech and Dialogue
ISBN
978-3-031-70563-2
ISSN
—
e-ISSN
—
Number of pages
13
Pages from-to
239-251
Publisher name
Springer
Place of publication
Cham, Switzerland
Event location
Brno, Czechia
Event date
Sep 11, 2024
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—