Building Indonesian Dependency Parser Using Cross-lingual Transfer Learning
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3ASNZLRUMG" target="_blank" >RIV/00216208:11320/22:SNZLRUMG - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1109/IALP57159.2022.9961296" target="_blank" >https://doi.org/10.1109/IALP57159.2022.9961296</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/IALP57159.2022.9961296" target="_blank" >10.1109/IALP57159.2022.9961296</a>
Alternative languages
Result language
English
Original language name
Building Indonesian Dependency Parser Using Cross-lingual Transfer Learning
Original language description
In recent years, cross-lingual transfer learning has gained traction across NLP tasks. This research aims to develop a dependency parser for Indonesian using cross-lingual transfer learning. The parser uses a Transformer encoder and a deep biaffine attention decoder. The model is trained by transfer learning from a source language to the target language, with fine-tuning. We compare four source languages: French, Italian, Slovenian, and English. The proposed approach improves the performance of the Indonesian dependency parser on both same-domain and cross-domain testing. Compared to the baseline model, our best model increases UAS by up to 4.31% and LAS by up to 4.46%. Among the source treebanks, French and Italian, which were selected based on LangRank output, perform better than the languages selected on other criteria. French, which LangRank ranks highest, performs best in cross-lingual transfer learning for the dependency parser model.
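The description reports results as UAS and LAS (unlabeled and labeled attachment scores). As a minimal sketch of how these two metrics are commonly computed, assuming each token is represented by a (head index, relation label) pair, the following may help. The function name `attachment_scores` and the three-token toy sentence are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the authors' evaluation setup.

```python
from typing import List, Tuple

# One entry per token: (index of the head token, dependency relation label).
# Head index 0 denotes the artificial root. This layout is an assumption
# made for illustration only.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) in percent for two aligned token sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")

    # UAS counts tokens whose predicted head matches the gold head.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS additionally requires the relation label to match.
    labeled_hits = sum(g == p for g, p in zip(gold, pred))

    n = len(gold)
    return 100.0 * head_hits / n, 100.0 * labeled_hits / n


# Hypothetical toy data: all heads are correct, one label is wrong.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

Because LAS requires both the head and the label to be correct, it can never exceed UAS on the same data.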
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2022
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
2022 International Conference on Asian Language Processing (IALP)
ISBN
978-1-66547-674-4
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
488-493
Publisher name
IEEE
Place of publication
—
Event location
Singapore, Singapore
Event date
Jan 1, 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000896159700083