A Smaller and Better Word Embedding for Neural Machine Translation
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3ASGHGDUL7" target="_blank" >RIV/00216208:11320/23:SGHGDUL7 - isvavai.cz</a>
Result on the web
<a href="https://www.webofscience.com/wos/woscc/summary/e0b8ef34-8e6b-412a-9b8f-87607433ed44-bb92f483/relevance/1" target="_blank" >https://www.webofscience.com/wos/woscc/summary/e0b8ef34-8e6b-412a-9b8f-87607433ed44-bb92f483/relevance/1</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/access.2023.3270171" target="_blank" >10.1109/access.2023.3270171</a>
Alternative languages
Result language
English
Original language name
A Smaller and Better Word Embedding for Neural Machine Translation
Original language description
"Word embeddings play an important role in Neural Machine Translation (NMT). However, it still has a series of problems such as ignoring the prior knowledge of the association between words, relying on specific task constraints passively in parameter training, and isolating individual embedding's learning process from one another. In this paper, we propose a new word embedding method to add the prior knowledge of the association between words to the training process, and at the same time to share the iterative training results among all word embeddings. This method is applicable to all mainstream NMT systems. In our experiments, it achieves an improvement of +0.9 BLEU points on the WMT'14 English?German task. On the Global Voices v2018q4 Spanish?Czech low-resource translation tasks, it leads to a more prominent performance improvement over the strong baselines (a +2.6 BLEU improvement on average). As another "bonus", the new word embedding has far fewer parameters than the traditional word embedding, even as low as 15% of the parameters of the baselines."
Czech name
—
Czech description
—
Classification
Type
J<sub>ost</sub> - Miscellaneous article in a specialist periodical
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
"IEEE ACCESS"
ISSN
2169-3536
e-ISSN
—
Volume of the periodical
11
Issue of the periodical within the volume
2023
Country of publishing house
US - UNITED STATES
Number of pages
9
Pages from-to
40770-40778
UT code for WoS article
001033140800001
EID of the result in the Scopus database
—