A method for constructing word sense embeddings based on word sense induction

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F23%3A10254662" target="_blank" >RIV/61989100:27240/23:10254662 - isvavai.cz</a>
Result on the web
<a href="https://www.nature.com/articles/s41598-023-40062-3" target="_blank" >https://www.nature.com/articles/s41598-023-40062-3</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1038/s41598-023-40062-3" target="_blank" >10.1038/s41598-023-40062-3</a>

Alternative languages

Result language
English
Title in original language
A method for constructing word sense embeddings based on word sense induction
Result description in original language
Polysemy is an inherent characteristic of natural language. In order to make it easier to distinguish between different senses of polysemous words, we propose a method for encoding multiple different senses of polysemous words using a single vector. The method first uses a two-layer bidirectional long short-term memory neural network and a self-attention mechanism to extract the contextual information of polysemous words. Then, a K-means algorithm, which is improved by optimizing the density peaks clustering algorithm based on cosine similarity, is applied to perform word sense induction on the contextual information of polysemous words. Finally, the method constructs the corresponding word sense embedded representations of the polysemous words. The results of the experiments demonstrate that the proposed method produces better word sense induction than Euclidean distance, Pearson correlation, and KL-divergence and more accurate word sense embeddings than mean shift, DBSCAN, spectral clustering, and agglomerative clustering. (C) 2023, Springer Nature Limited.
Title in English
A method for constructing word sense embeddings based on word sense induction
Result description in English
Polysemy is an inherent characteristic of natural language. In order to make it easier to distinguish between different senses of polysemous words, we propose a method for encoding multiple different senses of polysemous words using a single vector. The method first uses a two-layer bidirectional long short-term memory neural network and a self-attention mechanism to extract the contextual information of polysemous words. Then, a K-means algorithm, which is improved by optimizing the density peaks clustering algorithm based on cosine similarity, is applied to perform word sense induction on the contextual information of polysemous words. Finally, the method constructs the corresponding word sense embedded representations of the polysemous words. The results of the experiments demonstrate that the proposed method produces better word sense induction than Euclidean distance, Pearson correlation, and KL-divergence and more accurate word sense embeddings than mean shift, DBSCAN, spectral clustering, and agglomerative clustering. (C) 2023, Springer Nature Limited.
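
The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the context vectors of one polysemous word are already available as rows of a NumPy array (the BiLSTM and self-attention encoder is not shown), it picks initial centers with a simplified density-peaks rule on cosine distance, and it then runs K-means with cosine similarity. The function names, the `cutoff` parameter, and the use of the final centroids as sense vectors are all assumptions made for illustration.

```python
import numpy as np


def cosine_distance_matrix(X):
    # Normalize rows to unit length; cosine distance = 1 - cosine similarity.
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    return np.clip(1.0 - U @ U.T, 0.0, 2.0)


def density_peak_centers(X, k, cutoff):
    """Return indices of k initial centers (hypothetical helper).

    rho is the number of neighbours within `cutoff`; delta is the distance
    to the nearest point that ranks higher in density. Points with a large
    rho * delta are dense and far from any denser point.
    """
    D = cosine_distance_matrix(X)
    rho = (D < cutoff).sum(axis=1) - 1  # exclude the point itself
    order = np.argsort(-rho, kind="stable")  # densest first, ties by index
    delta = np.empty(len(X))
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, len(order)):
        i = order[rank]
        delta[i] = D[i, order[:rank]].min()
    return np.argsort(-(rho * delta), kind="stable")[:k]


def cosine_kmeans(X, k, cutoff=0.1, n_iter=50):
    """K-means on unit vectors, seeded by density peaks (a sketch)."""
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    centers = U[density_peak_centers(X, k, cutoff)].copy()
    for _ in range(n_iter):
        # Assign each context vector to the most similar center.
        labels = np.argmax(U @ centers.T, axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = U[labels == j]
            if len(members):
                mean = members.mean(axis=0)
                norm = np.linalg.norm(mean)
                if norm > 0:
                    new_centers[j] = mean / norm
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Assumption: each centroid serves as one sense vector for the word.
    return labels, centers
```

Calling `cosine_kmeans(X, k)` returns one cluster label per context and `k` unit-length centroids. The number of senses `k` and the density `cutoff` have to be chosen by the caller in this sketch.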

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
S - Specific research at universities

Others

Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to result type

Name of the periodical
Scientific Reports
ISSN
2045-2322
e-ISSN
—
Volume of the periodical
13
Issue of the periodical within the volume
1
Country of the publisher
US - United States of America
Number of pages
13
Pages from-to
—
UT WoS code of the article
001045574100067
EID of the result in the Scopus database
2-s2.0-85167532342

Similar results (10)

Unsupervised Word Sense Disambiguation Using Word Embeddings
Word Sense Induction Using Word Sketches
Disambiguace mnohoznačných slov pomocí vizuální informace (Disambiguation of polysemous words using visual information)
