Unsupervised Word Sense Disambiguation Using Word Embeddings
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/19:10405583 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10405583)
Result on the web
https://fruct.org/publications/fruct25/files/Mor.pdf
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Unsupervised Word Sense Disambiguation Using Word Embeddings
Result description in original language
Word sense disambiguation is the task of assigning the correct sense to a polysemous word in the context in which it appears. In recent years, word embeddings have been applied successfully to many NLP tasks. Because embeddings capture distributional semantics, recent work has focused on using them to disambiguate words. This paper proposes a novel unsupervised method that disambiguates words of a first language by deploying a trained word embeddings model of a second language, using only a bilingual dictionary. While the translated words are useful clues for the disambiguation process, the main idea of this work is to use the information provided by the English translations of surrounding words to disambiguate Persian words with a trained English word2vec model, a well-known word embeddings model. Each translation of the polysemous word is compared against the word embeddings of the translated surrounding words to calculate word similarity scores and the most similar word to vec
Title in English
Unsupervised Word Sense Disambiguation Using Word Embeddings
Result description in English
Word sense disambiguation is the task of assigning the correct sense to a polysemous word in the context in which it appears. In recent years, word embeddings have been applied successfully to many NLP tasks. Because embeddings capture distributional semantics, recent work has focused on using them to disambiguate words. This paper proposes a novel unsupervised method that disambiguates words of a first language by deploying a trained word embeddings model of a second language, using only a bilingual dictionary. While the translated words are useful clues for the disambiguation process, the main idea of this work is to use the information provided by the English translations of surrounding words to disambiguate Persian words with a trained English word2vec model, a well-known word embeddings model. Each translation of the polysemous word is compared against the word embeddings of the translated surrounding words to calculate word similarity scores and the most similar word to vec
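The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embedding vectors, the dictionary entries, the transliterated Persian words, and the names `TOY_EMBEDDINGS`, `TOY_DICTIONARY`, `cosine`, and `disambiguate` are all hypothetical, and averaging the cosine similarities is an assumption, since the description above is cut off before it states how the scores are combined. In practice the toy vectors would be replaced by a trained English word2vec model.

```python
import math

# Hypothetical toy English embeddings: made-up 3-d vectors, NOT real word2vec output.
TOY_EMBEDDINGS = {
    "milk":   [0.9, 0.1, 0.0],
    "lion":   [0.0, 0.9, 0.1],
    "faucet": [0.1, 0.0, 0.9],
    "drink":  [0.8, 0.2, 0.1],
    "glass":  [0.7, 0.0, 0.3],
    "jungle": [0.1, 0.9, 0.0],
}

# Hypothetical bilingual dictionary: transliterated Persian word -> English translations.
TOY_DICTIONARY = {
    "shir": ["milk", "lion", "faucet"],  # the polysemous target word
    "nushidan": ["drink"],
    "livan": ["glass"],
    "jangal": ["jungle"],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def disambiguate(target, context, dictionary=TOY_DICTIONARY, embeddings=TOY_EMBEDDINGS):
    """Return (best_translation, scores) for `target` given surrounding words.

    Each candidate translation of the target is scored by its mean cosine
    similarity to the translations of the context words. The mean is an
    assumed aggregation, chosen only for illustration.
    """
    # Collect embeddings of every known translation of every context word.
    context_vectors = [
        embeddings[translation]
        for word in context
        for translation in dictionary.get(word, [])
        if translation in embeddings
    ]
    scores = {}
    for candidate in dictionary.get(target, []):
        if candidate not in embeddings or not context_vectors:
            continue  # skip candidates that cannot be compared
        sims = [cosine(embeddings[candidate], vec) for vec in context_vectors]
        scores[candidate] = sum(sims) / len(sims)
    best = max(scores, key=scores.get) if scores else None
    return best, scores


if __name__ == "__main__":
    print(disambiguate("shir", ["nushidan", "livan"]))
    print(disambiguate("shir", ["jangal"]))
```

With these toy vectors, a context translated as "drink" and "glass" selects "milk", while a context translated as "jungle" selects "lion". Only the bilingual dictionary and the second-language embeddings are needed, which is what makes the approach unsupervised with respect to the first language.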
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the realization of more than one project. More information is available in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the collection
Proceedings of the 25th Conference of Open Innovations Association FRUCT 2019
ISBN
978-952-69244-0-3
ISSN
2305-7254
e-ISSN
—
Number of pages
6
Pages from-to
228-233
Publisher name
Finnish-Russian University Cooperation in Telecommunications
Place of publication
Helsinki, Finland
Event location
Helsinki, Finland
Event date
5 November 2019
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—