Lexical data mining-based approach for the self-enrichment of LMF standardized dictionaries: Case of the syntactico-semantic knowledge
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F21%3A10441625" target="_blank" >RIV/00216208:11320/21:10441625 - isvavai.cz</a>
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=Itcfb.rlbg" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=Itcfb.rlbg</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1002/cpe.6312" target="_blank" >10.1002/cpe.6312</a>
Alternative languages
Result language
English
Title in original language
Lexical data mining-based approach for the self-enrichment of LMF standardized dictionaries: Case of the syntactico-semantic knowledge
Result description in original language
The LMF ISO standard offers broad coverage of lexical knowledge through a fine-grained structure. However, like most electronic dictionaries, the available normalized LMF dictionaries contain only basic morpho-syntactic and semantic knowledge, such as the meanings of lexical entries given through definitions and associated examples, and sometimes synonyms and antonyms. More sophisticated knowledge, such as syntactic behaviors, semantic classes and syntactico-semantic links, is scarce: it requires considerable expertise, and adding it to dictionaries is expensive. In this paper, we propose a lexical data mining approach that exploits the widely available textual content associated with the meanings, notably in the normalized LMF dictionaries, so that these dictionaries can enrich themselves. First, we contribute to the enrichment of the syntactic behaviors by linking them to the suitable meanings. Second, we enrich the meanings of LMF lexical entries with semantic classes based on the Gaston Gross semantic classification. Finally, we establish the syntactico-semantic links from the results of the syntactic and semantic enrichment processes. The proposed approach is supported by an experiment carried out on an available normalized LMF dictionary for the Arabic language.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Other
Publication year
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Concurrency and Computation: Practice and Experience
ISSN
1532-0626
e-ISSN
1532-0634
Volume of the periodical
33
Issue of the periodical within the volume
17
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages of the result
32
Pages from-to
e6312
UT WoS code of the article
000640935900001
EID of the result in the Scopus database
2-s2.0-85104407123