Building a Morphological Network for Persian on Top of a Morpheme-Segmented Lexicon
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10405566" target="_blank" >RIV/00216208:11320/19:10405566 - isvavai.cz</a>
Result on the web
<a href="https://ufal.mff.cuni.cz/derimo2019/pdf-files/derimo2019.pdf" target="_blank" >https://ufal.mff.cuni.cz/derimo2019/pdf-files/derimo2019.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Building a Morphological Network for Persian on Top of a Morpheme-Segmented Lexicon
Original language description
In this work, we introduce a new large hand-annotated morpheme-segmentation lexicon of Persian words and present an algorithm that builds a morphological network using this segmented lexicon. The resulting network captures both derivational and inflectional relations. The algorithm for inducing the network approximates the distinction between root morphemes and affixes using the number of morpheme occurrences in the lexicon. We evaluate the quality (in the sense of linguistic correctness) of the resulting network empirically and compare it to the quality of a network generated in a setup based on manually distinguished non-root morphemes. In the second phase of this work, we evaluated various strategies for automatically adding new words (those not processed in the segmented lexicon) to an existing morphological network. For this purpose, we created primary morphological networks from two initial data sources: a manually segmented lexicon and an automatically segmented lexicon created by unsupervised MORFESSOR. The
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the Second International Workshop on Resources and Tools for Derivational Morphology (DeriMo 2019)
ISBN
978-80-88132-08-0
ISSN
—
e-ISSN
—
Number of pages
10
Pages from-to
91-100
Publisher name
ÚFAL MFF UK
Place of publication
Praha, Czechia
Event location
Praha, Czechia
Event date
Sep 19, 2019
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—