Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10405528" target="_blank" >RIV/00216208:11320/19:10405528 - isvavai.cz</a>
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=PazElY0WnY" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=PazElY0WnY</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.13053/CyS-23-3-3265" target="_blank" >10.13053/CyS-23-3-3265</a>

Alternative languages

Result language
English
Title in the original language
Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed
Result description in the original language
The utility of linguistic annotation in neural machine translation seemed to have been established in past papers. The experiments were, however, limited to recurrent sequence-to-sequence architectures and relatively small data settings. We focus on the state-of-the-art Transformer model and use comparatively larger corpora. Specifically, we try to promote the knowledge of source-side syntax using multi-task learning, either through simple data manipulation techniques or through a dedicated model component. In particular, we train one of the Transformer attention heads to produce the source-side dependency tree. Overall, our results cast some doubt on the utility of multi-task setups with linguistic information. The data manipulation techniques, recommended in previous works, prove ineffective in large data settings. The treatment of self-attention as dependencies seems much more promising: it helps in translation and reveals that the Transformer model can very easily grasp the syntactic structure. An important but curi
Title in English
Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed
Result description in English
The utility of linguistic annotation in neural machine translation seemed to have been established in past papers. The experiments were, however, limited to recurrent sequence-to-sequence architectures and relatively small data settings. We focus on the state-of-the-art Transformer model and use comparatively larger corpora. Specifically, we try to promote the knowledge of source-side syntax using multi-task learning, either through simple data manipulation techniques or through a dedicated model component. In particular, we train one of the Transformer attention heads to produce the source-side dependency tree. Overall, our results cast some doubt on the utility of multi-task setups with linguistic information. The data manipulation techniques, recommended in previous works, prove ineffective in large data settings. The treatment of self-attention as dependencies seems much more promising: it helps in translation and reveals that the Transformer model can very easily grasp the syntactic structure. An important but curi

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

Project
<a href="/cs/project/GX19-26934X" target="_blank" >GX19-26934X: Neural representations in multimodal and multilingual modeling</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2019
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Computacion y Sistemas
ISSN
1405-5546
e-ISSN
—
Periodical volume
23
Issue number within the volume
3
Publisher's country
MX - United Mexican States
Number of pages
12
Pages from-to
923-934
UT WoS code of the article
000489136900029
EID of the result in the Scopus database
2-s2.0-85076684081

Similar results (10)

Input Combination Strategies for Multi-Source Transformer Decoder
Attention Strategies for Multi-Source Sequence-to-Sequence Learning
Improving Translation Model by Monolingual Data
