Cross-lingual Adaptation Using Universal Dependencies
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F21%3A10441595" target="_blank" >RIV/00216208:11320/21:10441595 - isvavai.cz</a>
Výsledek na webu
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=dZhh3n11ec" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=dZhh3n11ec</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1145/3448251" target="_blank" >10.1145/3448251</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Cross-lingual Adaptation Using Universal Dependencies
Popis výsledku v původním jazyce
We describe a cross-lingual adaptation method based on syntactic parse trees obtained from the Universal Dependencies (UD), which are consistent across languages, to develop classifiers in low-resource languages. The idea of UD parsing is to capture similarities as well as idiosyncrasies among typologically different languages. In this article, we show that models trained using UD parse trees for complex NLP tasks can characterize very different languages. We study two tasks of paraphrase identification and relation extraction as case studies. Based on UD parse trees, we develop several models using tree kernels and show that these models trained on the English dataset can correctly classify data of other languages, e.g., French, Farsi, and Arabic. The proposed approach opens up avenues for exploiting UD parsing in solving similar cross-lingual tasks, which is very useful for languages for which no labeled data is available.
Název v anglickém jazyce
Cross-lingual Adaptation Using Universal Dependencies
Popis výsledku anglicky
We describe a cross-lingual adaptation method based on syntactic parse trees obtained from the Universal Dependencies (UD), which are consistent across languages, to develop classifiers in low-resource languages. The idea of UD parsing is to capture similarities as well as idiosyncrasies among typologically different languages. In this article, we show that models trained using UD parse trees for complex NLP tasks can characterize very different languages. We study two tasks of paraphrase identification and relation extraction as case studies. Based on UD parse trees, we develop several models using tree kernels and show that these models trained on the English dataset can correctly classify data of other languages, e.g., French, Farsi, and Arabic. The proposed approach opens up avenues for exploiting UD parsing in solving similar cross-lingual tasks, which is very useful for languages for which no labeled data is available.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
—
Návaznosti
—
Ostatní
Rok uplatnění
2021
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
ACM Transactions on Asian and Low-Resource Language Information Processing
ISSN
2375-4699
e-ISSN
2375-4702
Svazek periodika
20
Číslo periodika v rámci svazku
4
Stát vydavatele periodika
US - Spojené státy americké
Počet stran výsledku
23
Strana od-do
65
Kód UT WoS článku
000721582900012
EID výsledku v databázi Scopus
2-s2.0-85111026083