An accurate transformer-based model for transition-based dependency parsing of free word order languages
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AYJ3TDMYL" target="_blank" >RIV/00216208:11320/25:YJ3TDMYL - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85196791144&doi=10.1016%2fj.jksuci.2024.102107&partnerID=40&md5=53c288a4abdb146ff518c1db179c9722" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85196791144&doi=10.1016%2fj.jksuci.2024.102107&partnerID=40&md5=53c288a4abdb146ff518c1db179c9722</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.jksuci.2024.102107" target="_blank" >10.1016/j.jksuci.2024.102107</a>
Alternative languages
Result language
English
Original language name
An accurate transformer-based model for transition-based dependency parsing of free word order languages
Original language description
Transformer models are the state of the art in Natural Language Processing (NLP) and the core of Large Language Models (LLMs). We propose a transformer-based model for transition-based dependency parsing of free word order languages. We performed experiments on five treebanks from the Universal Dependencies (UD) dataset, version 2.12. Our experiments show that a transformer model trained with dynamic word embeddings performs better than a multilayer perceptron (MLP) trained on state-of-the-art static word embeddings, even when the dynamic word embeddings have a vocabulary ten times smaller than that of the static word embeddings. The transformer trained on dynamic word embeddings achieves an unlabeled attachment score (UAS) of 84.17% for Urdu, which is approximately 3.6 and 1.9 percentage points higher than the UAS scores of 80.56857% and 82.26859% achieved by the MLP using two state-of-the-art static word embeddings. The proposed approach is also evaluated on Arabic, Persian and Uyghur, in addition to Urdu, and the UAS results suggest that it outperforms the MLP-based approaches.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES
ISSN
1319-1578
e-ISSN
2213-1248
Volume of the periodical
36
Issue of the periodical within the volume
6
Country of publishing house
US - UNITED STATES
Number of pages
12
Pages from-to
1-12
UT code for WoS article
001261229500001
EID of the result in the Scopus database
2-s2.0-85196791144