Translating Short Segments with NMT: A Case Study in English-to-Hindi
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F18%3A10390198" target="_blank" >RIV/00216208:11320/18:10390198 - isvavai.cz</a>
Result on the web
<a href="http://rua.ua.es/dspace/handle/10045/76083" target="_blank" >http://rua.ua.es/dspace/handle/10045/76083</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Translating Short Segments with NMT: A Case Study in English-to-Hindi
Original language description
This paper presents a case study in translating short image captions of the Visual Genome dataset from English into Hindi using out-of-domain data sets of varying size. We experiment with three NMT models: the shallow and deep sequence-to-sequence models and the Transformer model, as implemented in the Marian toolkit. Phrase-based Moses serves as the baseline. The results indicate that the Transformer model outperforms the others in the large data setting on a number of automatic metrics and in manual evaluation, and it also produces the fewest truncated sentences. Transformer training is, however, very sensitive to the hyperparameters, so it requires more experimentation. The deep sequence-to-sequence model produced more flawless outputs in the small data setting and was generally more stable, at the cost of more training iterations.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/LM2015071" target="_blank" >LM2015071: Language Research Infrastructure in the Czech Republic</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2018
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 21st Annual Conference of the European Association for Machine Translation (2018)
ISBN
978-84-09-01901-4
ISSN
—
e-ISSN
not stated
Number of pages
392
Pages from-to
1-392
Publisher name
European Association for Machine Translation
Place of publication
Allschwil, Switzerland
Event location
Alicante, Spain
Event date
May 28, 2018
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—