Morphological analysis and disambiguation for Breton
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F21%3A10441632" target="_blank" >RIV/00216208:11320/21:10441632 - isvavai.cz</a>
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=Ihw-0KwxJ0" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=Ihw-0KwxJ0</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10579-020-09510-8" target="_blank" >10.1007/s10579-020-09510-8</a>
Alternative languages
Result language
English
Title in the original language
Morphological analysis and disambiguation for Breton
Result description in the original language
In this paper we present an extended description of two resources for natural language processing of Breton, a morphological analyser and constraint grammar-based disambiguator. The constraint grammar was developed using a novel methodology by a linguist and a language consultant creating rules to solve specific errors in disambiguation in a machine translation system. In addition we introduce a new morphologically-disambiguated corpus of Breton and evaluate both the morphological analyser and constraint grammar for coverage and accuracy. For comparison we use the same corpus to train several reference systems for part-of-speech tagging and lemmatisation and compare the performance. The experiments show that our system outperforms the reference systems by a wide margin when the reference systems are trained without an external full-form list, and performs comparably when they are trained with a full-form list generated from our morphological analyser.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Language Resources and Evaluation
ISSN
1574-020X
e-ISSN
1574-0218
Volume of the periodical
55
Issue of the periodical within the volume
2
Country of the periodical's publisher
NL - Netherlands
Number of pages
43
Pages from-to
431-473
UT WoS code of the article
000590025300001
EID of the result in the Scopus database
2-s2.0-85096077940