Efficiently Reusing Old Models Across Languages via Transfer Learning
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F20%3A10424459" target="_blank" >RIV/00216208:11320/20:10424459 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Efficiently Reusing Old Models Across Languages via Transfer Learning
Original language description
Recent progress in neural machine translation is directed towards larger neural networks trained on an increasing amount of hardware resources. As a result, NMT models are costly to train, both financially, due to the electricity and hardware cost, and environmentally, due to the carbon footprint. This is especially true in transfer learning, which carries the additional cost of training the "parent" model before transferring knowledge and training the desired "child" model. In this paper, we propose a simple method of re-using an already trained model for different language pairs where there is no need for modifications in model architecture. Our approach does not need a separate parent model for each investigated language pair, as is typical in NMT transfer learning. To show the applicability of our method, we recycle a Transformer model trained by different researchers and use it to seed models for different language pairs. We achieve better translation quality and shorter convergence times than when tra
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2020
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (2020)
ISBN
978-989-33-0589-8
ISSN
—
e-ISSN
—
Number of pages
10
Pages from-to
1-10
Publisher name
European Association for Machine Translation
Place of publication
Lisboa, Portugal
Event location
Lisboa, Portugal
Event date
Nov 3, 2020
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—