Global Variants in the Czech Language
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/22:10456999 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3A10456999)
Result on the web
https://ics.upjs.sk/~antoni/ceur-ws.org/Vol-0000/paper14.pdf
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Global Variants in the Czech Language
Description in the original language
Some Czech words can be written in several different ways, e.g., lampion ~ lampión (lampion). This variability may occur either in some inflectional wordforms (inflectional variants), cf. hradu ~ hradě in the locative case of the noun hrad (castle), or across all inflectional wordforms and derivatives (global variants), cf. fantazijní ~ fantasijní in the adjective derived from the noun fantazie ~ fantasie (fantasy). It is reasonable to treat global variants as different words but to have formal means of interconnecting them in Natural Language Processing systems and resources. In this paper, we describe the identification of global variants in the Czech vocabulary and summarise new changes in the MorfFlex CZ dictionary and the DeriNet lexicon concerning this type of variant. We reviewed several typical patterns within the global variants captured in the available resources and combined a set of regular expressions with manual annotations to achieve the highest precision of the identification.
Title in English
Global Variants in the Czech Language
Description in English
Some Czech words can be written in several different ways, e.g., lampion ~ lampión (lampion). This variability may occur either in some inflectional wordforms (inflectional variants), cf. hradu ~ hradě in the locative case of the noun hrad (castle), or across all inflectional wordforms and derivatives (global variants), cf. fantazijní ~ fantasijní in the adjective derived from the noun fantazie ~ fantasie (fantasy). It is reasonable to treat global variants as different words but to have formal means of interconnecting them in Natural Language Processing systems and resources. In this paper, we describe the identification of global variants in the Czech vocabulary and summarise new changes in the MorfFlex CZ dictionary and the DeriNet lexicon concerning this type of variant. We reviewed several typical patterns within the global variants captured in the available resources and combined a set of regular expressions with manual annotations to achieve the highest precision of the identification.
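The combination of regular expressions and manual annotation mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' procedure: the two rewrite rules and the function names `normalise` and `candidate_variant_groups` are hypothetical, chosen only to cover the example pairs quoted in the description (lampion ~ lampión, fantazie ~ fantasie). Lemmas that collapse to the same key are proposed as candidate variants for a human annotator to confirm.

```python
import re
from collections import defaultdict

# Hypothetical rewrite rules (assumptions, not taken from the paper).
# Each rule maps one spelling alternative onto the other.
RULES = [
    (re.compile(r"ó"), "o"),                           # lampión -> lampion
    (re.compile(r"(?<=[aeiouy])s(?=[aeiouy])"), "z"),  # fantasie -> fantazie
]


def normalise(lemma):
    """Collapse a lemma to a key shared by its candidate spelling variants."""
    key = lemma.lower()
    for pattern, replacement in RULES:
        key = pattern.sub(replacement, key)
    return key


def candidate_variant_groups(lemmas):
    """Group lemmas by normalised key; keep groups with two or more members."""
    groups = defaultdict(set)
    for lemma in lemmas:
        groups[normalise(lemma)].add(lemma)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    lemmas = ["lampion", "lampión", "fantazie", "fantasie",
              "fantazijní", "fantasijní", "hrad"]
    for group in candidate_variant_groups(lemmas):
        print(" ~ ".join(group))
```

Rules this simple overgenerate: under the second rule, kosa (scythe) and koza (goat) would also be grouped, although they are unrelated words. This is one reason a manual annotation pass is needed when high precision is the goal.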
Classification
Type
O - Other results
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
The result was created during the implementation of multiple projects. More information is available in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations