Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10405528" target="_blank" >RIV/00216208:11320/19:10405528 - isvavai.cz</a>
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=PazElY0WnY" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=PazElY0WnY</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.13053/CyS-23-3-3265" target="_blank" >10.13053/CyS-23-3-3265</a>
Alternative languages
Result language
English
Title in the original language
Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed
Description in the original language
The utility of linguistic annotation in neural machine translation seemed to have been established in past papers. However, those experiments were limited to recurrent sequence-to-sequence architectures and relatively small data settings. We focus on the state-of-the-art Transformer model and use comparatively larger corpora. Specifically, we try to promote the knowledge of source-side syntax using multi-task learning, either through simple data manipulation techniques or through a dedicated model component. In particular, we train one of the Transformer attention heads to produce the source-side dependency tree. Overall, our results cast some doubt on the utility of multi-task setups with linguistic information. The data manipulation techniques recommended in previous works prove ineffective in large data settings. The treatment of self-attention as dependencies seems much more promising: it helps in translation and reveals that the Transformer model can very easily grasp the syntactic structure. An important but curi
Title in English
Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed
Description in English
The utility of linguistic annotation in neural machine translation seemed to have been established in past papers. However, those experiments were limited to recurrent sequence-to-sequence architectures and relatively small data settings. We focus on the state-of-the-art Transformer model and use comparatively larger corpora. Specifically, we try to promote the knowledge of source-side syntax using multi-task learning, either through simple data manipulation techniques or through a dedicated model component. In particular, we train one of the Transformer attention heads to produce the source-side dependency tree. Overall, our results cast some doubt on the utility of multi-task setups with linguistic information. The data manipulation techniques recommended in previous works prove ineffective in large data settings. The treatment of self-attention as dependencies seems much more promising: it helps in translation and reveals that the Transformer model can very easily grasp the syntactic structure. An important but curi
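As an illustration of the approach described in the abstract, the following is a minimal sketch (not the authors' code) of a multi-task objective in which one encoder self-attention head is trained so that each source token attends to its syntactic head; the function and parameter names (dependency_attention_loss, lambda_syn, head_index) are illustrative assumptions, not identifiers from the paper.

```python
# Hypothetical sketch of the multi-task idea: translation cross-entropy plus an
# auxiliary term that pushes one self-attention head toward the source dependency tree.
import torch
import torch.nn.functional as F

def dependency_attention_loss(attn_weights, head_positions, head_index=0):
    """attn_weights: [batch, n_heads, src_len, src_len] softmaxed encoder self-attention.
    head_positions: [batch, src_len] gold index of each token's dependency head
                    (the root may simply point to itself).
    The chosen head's attention row for each token is treated as a distribution
    over candidate head positions and scored with cross-entropy."""
    head_attn = attn_weights[:, head_index]            # [batch, src_len, src_len]
    log_probs = torch.log(head_attn.clamp_min(1e-9))   # guard against log(0)
    # nll_loss expects [batch, classes, positions] vs. target [batch, positions]
    return F.nll_loss(log_probs.transpose(1, 2), head_positions)

def multi_task_loss(translation_loss, attn_weights, head_positions, lambda_syn=0.1):
    """Combine the ordinary translation loss with the syntactic auxiliary term,
    weighted by an assumed hyperparameter lambda_syn."""
    return translation_loss + lambda_syn * dependency_attention_loss(
        attn_weights, head_positions)
```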
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result links
Project
<a href="/cs/project/GX19-26934X" target="_blank" >GX19-26934X: Neural representations in multimodal and multilingual modelling</a><br>
Links
P - Research and development project financed from public funds (with a link to CEP)
Others
Year of publication
2019
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Computación y Sistemas
ISSN
1405-5546
e-ISSN
—
Volume of the periodical
23
Issue of the periodical within the volume
3
Country of the publisher
MX - Mexico
Number of pages of the result
12
Pages from-to
923-934
UT WoS code of the article
000489136900029
EID of the result in the Scopus database
2-s2.0-85076684081