Enhancing Cross-Lingual Sarcasm Detection by a Prompt Learning Framework with Data Augmentation and Contrastive Learning
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3ABRAGIEAZ" target="_blank" >RIV/00216208:11320/25:BRAGIEAZ - isvavai.cz</a>
Výsledek na webu
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85195852507&doi=10.3390%2felectronics13112163&partnerID=40&md5=acd45b1c79d788cbdaa5cebc3f475c67" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85195852507&doi=10.3390%2felectronics13112163&partnerID=40&md5=acd45b1c79d788cbdaa5cebc3f475c67</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/electronics13112163" target="_blank" >10.3390/electronics13112163</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Enhancing Cross-Lingual Sarcasm Detection by a Prompt Learning Framework with Data Augmentation and Contrastive Learning
Popis výsledku v původním jazyce
Given their intricate nature and inherent ambiguity, sarcastic texts often mask deeper emotions, making it challenging to discern the genuine feelings behind the words. The proposal of the sarcasm detection task is to assist us with more accurately understanding the true intention of the speaker. Advanced methods, such as deep learning and neural networks, are widely used in the field of sarcasm detection. However, most research mainly focuses on sarcastic texts in English, as other languages lack corpora and annotated datasets. To address the challenge of low-resource languages in sarcasm detection tasks, a zero-shot cross-lingual transfer learning method is proposed in this paper. The proposed approach is based on prompt learning and aims to assist the model with understanding downstream tasks through prompts. Specifically, the model uses prompt templates to construct training data into cloze-style questions and then trains them using a pre-trained cross-lingual language model. Combining data augmentation and contrastive learning can further improve the capacity of the model for cross-lingual transfer learning. To evaluate the performance of the proposed model, we utilize a publicly accessible sarcasm dataset in English as training data in a zero-shot cross-lingual setting. When tested with Chinese as the target language for transfer, our model achieves F1-scores of 72.14% and 76.7% on two test datasets, outperforming the strong baselines by significant margins. © 2024 by the authors.
Název v anglickém jazyce
Enhancing Cross-Lingual Sarcasm Detection by a Prompt Learning Framework with Data Augmentation and Contrastive Learning
Popis výsledku anglicky
Given their intricate nature and inherent ambiguity, sarcastic texts often mask deeper emotions, making it challenging to discern the genuine feelings behind the words. The proposal of the sarcasm detection task is to assist us with more accurately understanding the true intention of the speaker. Advanced methods, such as deep learning and neural networks, are widely used in the field of sarcasm detection. However, most research mainly focuses on sarcastic texts in English, as other languages lack corpora and annotated datasets. To address the challenge of low-resource languages in sarcasm detection tasks, a zero-shot cross-lingual transfer learning method is proposed in this paper. The proposed approach is based on prompt learning and aims to assist the model with understanding downstream tasks through prompts. Specifically, the model uses prompt templates to construct training data into cloze-style questions and then trains them using a pre-trained cross-lingual language model. Combining data augmentation and contrastive learning can further improve the capacity of the model for cross-lingual transfer learning. To evaluate the performance of the proposed model, we utilize a publicly accessible sarcasm dataset in English as training data in a zero-shot cross-lingual setting. When tested with Chinese as the target language for transfer, our model achieves F1-scores of 72.14% and 76.7% on two test datasets, outperforming the strong baselines by significant margins. © 2024 by the authors.
Klasifikace
Druh
J<sub>SC</sub> - Článek v periodiku v databázi SCOPUS
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
—
Návaznosti
—
Ostatní
Rok uplatnění
2024
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
Electronics (Switzerland)
ISSN
2079-9292
e-ISSN
—
Svazek periodika
13
Číslo periodika v rámci svazku
11
Stát vydavatele periodika
US - Spojené státy americké
Počet stran výsledku
14
Strana od-do
1-14
Kód UT WoS článku
—
EID výsledku v databázi Scopus
2-s2.0-85195852507