A Smaller and Better Word Embedding for Neural Machine Translation
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3ASGHGDUL7" target="_blank" >RIV/00216208:11320/23:SGHGDUL7 - isvavai.cz</a>
Result on the web
<a href="https://www.webofscience.com/wos/woscc/summary/e0b8ef34-8e6b-412a-9b8f-87607433ed44-bb92f483/relevance/1" target="_blank" >https://www.webofscience.com/wos/woscc/summary/e0b8ef34-8e6b-412a-9b8f-87607433ed44-bb92f483/relevance/1</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/access.2023.3270171" target="_blank" >10.1109/access.2023.3270171</a>
Alternative languages
Result language
English
Title in the original language
A Smaller and Better Word Embedding for Neural Machine Translation
Result description in the original language
"Word embeddings play an important role in Neural Machine Translation (NMT). However, it still has a series of problems such as ignoring the prior knowledge of the association between words, relying on specific task constraints passively in parameter training, and isolating individual embedding's learning process from one another. In this paper, we propose a new word embedding method to add the prior knowledge of the association between words to the training process, and at the same time to share the iterative training results among all word embeddings. This method is applicable to all mainstream NMT systems. In our experiments, it achieves an improvement of +0.9 BLEU points on the WMT'14 English→German task. On the Global Voices v2018q4 Spanish→Czech low-resource translation tasks, it leads to a more prominent performance improvement over the strong baselines (a +2.6 BLEU improvement on average). As another "bonus", the new word embedding has far fewer parameters than the traditional word embedding, even as low as 15% of the parameters of the baselines."
Title in English
A Smaller and Better Word Embedding for Neural Machine Translation
Result description in English
"Word embeddings play an important role in Neural Machine Translation (NMT). However, it still has a series of problems such as ignoring the prior knowledge of the association between words, relying on specific task constraints passively in parameter training, and isolating individual embedding's learning process from one another. In this paper, we propose a new word embedding method to add the prior knowledge of the association between words to the training process, and at the same time to share the iterative training results among all word embeddings. This method is applicable to all mainstream NMT systems. In our experiments, it achieves an improvement of +0.9 BLEU points on the WMT'14 English→German task. On the Global Voices v2018q4 Spanish→Czech low-resource translation tasks, it leads to a more prominent performance improvement over the strong baselines (a +2.6 BLEU improvement on average). As another "bonus", the new word embedding has far fewer parameters than the traditional word embedding, even as low as 15% of the parameters of the baselines."
Classification
Type
J<sub>ost</sub> - Other articles in peer-reviewed periodicals
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Year of implementation
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
"IEEE ACCESS"
ISSN
2169-3536
e-ISSN
—
Volume of the periodical
11
Issue of the periodical within the volume
2023
Country of the periodical's publisher
US - United States of America
Number of pages
9
Pages from-to
40770-40778
UT WoS code of the article
001033140800001
EID of the result in the Scopus database
—