Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/23:AJAYHRXJ - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AAJAYHRXJ)
Result on the web
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85161395398&doi=10.3390%2fmath11112548&partnerID=40&md5=d8d1ceb79982fced175e76b84cd85ef0
DOI - Digital Object Identifier
10.3390/math11112548 (http://dx.doi.org/10.3390/math11112548)
Alternative languages
Result language
English
Title in the original language
Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation
Result description in the original language
"Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems since their misidentification can result in ambiguity and misunderstanding of the underlying text. In this work, we evaluate the performance of the mBERT model for MWE identification in a multilingual context by training it on all 14 languages available in version 1.2 of the PARSEME corpus. We also incorporate lateral inhibition and language adversarial training into our methodology to create language-independent embeddings and improve its capabilities in identifying multiword expressions. The evaluation of our models shows that the approach employed in this work achieves better results compared to the best system of the PARSEME 1.2 competition, MTLB-STRUCT, on 11 out of 14 languages for global MWE identification and on 12 out of 14 languages for unseen MWE identification. Additionally, averaged across all languages, our best approach outperforms the MTLB-STRUCT system by 1.23% on global MWE identification and by 4.73% on unseen global MWE identification. © 2023 by the authors."
Classification
Type
Jsc - Article in a periodical indexed in the SCOPUS database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Year of implementation
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
"Mathematics"
ISSN
2227-7390
e-ISSN
—
Volume of the periodical
11
Issue of the periodical within the volume
11
Country of the periodical's publisher
US - United States of America
Number of pages of the result
18
Pages from-to
1-18
UT WoS code of the article
—
EID of the result in the Scopus database
2-s2.0-85161395398