Semi-Automatic Construction of Word-Formation Networks
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F20%3A10424328" target="_blank" >RIV/00216208:11320/20:10424328 - isvavai.cz</a>
Výsledek na webu
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=98crtcq4Sr" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=98crtcq4Sr</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10579-019-09484-2" target="_blank" >10.1007/s10579-019-09484-2</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Semi-Automatic Construction of Word-Formation Networks
Popis výsledku v původním jazyce
The article presents a semi-automatic method for the construction of word-formation networks focusing particularly on derivation. The proposed approach applies a sequential pattern mining technique to construct useful morphological features in an unsupervised manner. The features take the form of regular expressions and later are used to feed a machine-learned ranking model. The network is constructed by applying the learned model to sort the lists of possible base words and selecting the most probable ones. This approach, besides relatively small training set and a lexicon, does not require any additional language resources such as a list of vowel and consonant alternations, part-of-speech tags etc. The proposed approach is evaluated on lexeme sets of four languages, namely Polish, Spanish, Czech, and French. The conducted experiments demonstrate the ability of the proposed method to construct linguistically adequate word-formation networks from small training sets. Furthermore, the performed feasibi
Název v anglickém jazyce
Semi-Automatic Construction of Word-Formation Networks
Popis výsledku anglicky
The article presents a semi-automatic method for the construction of word-formation networks focusing particularly on derivation. The proposed approach applies a sequential pattern mining technique to construct useful morphological features in an unsupervised manner. The features take the form of regular expressions and later are used to feed a machine-learned ranking model. The network is constructed by applying the learned model to sort the lists of possible base words and selecting the most probable ones. This approach, besides relatively small training set and a lexicon, does not require any additional language resources such as a list of vowel and consonant alternations, part-of-speech tags etc. The proposed approach is evaluated on lexeme sets of four languages, namely Polish, Spanish, Czech, and French. The conducted experiments demonstrate the ability of the proposed method to construct linguistically adequate word-formation networks from small training sets. Furthermore, the performed feasibi
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
Výsledek vznikl pri realizaci vícero projektů. Více informací v záložce Projekty.
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Ostatní
Rok uplatnění
2020
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
Language Resources and Evaluation
ISSN
1574-020X
e-ISSN
—
Svazek periodika
54
Číslo periodika v rámci svazku
1
Stát vydavatele periodika
NL - Nizozemsko
Počet stran výsledku
30
Strana od-do
1-30
Kód UT WoS článku
000636183900002
EID výsledku v databázi Scopus
2-s2.0-85078335469