Building Indonesian Dependency Parser Using Cross-lingual Transfer Learning
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3ASNZLRUMG" target="_blank" >RIV/00216208:11320/22:SNZLRUMG - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1109/IALP57159.2022.9961296" target="_blank" >https://doi.org/10.1109/IALP57159.2022.9961296</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/IALP57159.2022.9961296" target="_blank" >10.1109/IALP57159.2022.9961296</a>
Alternative languages
Result language
English
Title in the original language
Building Indonesian Dependency Parser Using Cross-lingual Transfer Learning
Result description in the original language
In recent years, cross-lingual transfer learning has gained traction across NLP tasks. This research aims to develop a dependency parser for Indonesian using cross-lingual transfer learning. The parser uses a Transformer as the encoder and a deep biaffine attention decoder. The model is trained using a transfer learning approach from a source language to the target language with fine-tuning. We choose four source languages for comparison: French, Italian, Slovenian, and English. The proposed approach improves the performance of the dependency parser for Indonesian as the target domain in both same-domain and cross-domain testing. Compared to the baseline model, our best model increases UAS by up to 4.31% and LAS by up to 4.46%. Among the chosen source-language treebanks, French and Italian, which were selected based on LangRank output, perform better than the languages selected based on other criteria. French, which has the highest LangRank rank, performs best in cross-lingual transfer learning for the dependency parser model.
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2022
Data confidentiality code
S - Complete and truthful project data are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
2022 International Conference on Asian Language Processing (IALP)
ISBN
978-1-66547-674-4
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
488-493
Publisher name
IEEE
Place of publication
—
Event location
Singapore, Singapore
Event date
1 January 2022
Type of event by nationality
WRD - Worldwide event
UT WoS code of the article
000896159700083