Sarcasm Detection on Czech and English Twitter

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F14%3A43923001" target="_blank" >RIV/49777513:23520/14:43923001 - isvavai.cz</a>
Výsledek na webu
—
DOI - Digital Object Identifier
—

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Sarcasm Detection on Czech and English Twitter
Popis výsledku v původním jazyce
This paper presents a machine learning approach to sarcasm detection on Twitter in two languages -- English and Czech. This is the first attempt at sarcasm detection in the Czech language. We created a large Czech Twitter corpus consisting of 7,000 manually-labelled tweets and provide it to the community. We evaluate two classifiers with various combinations of features on both the Czech and English datasets. Furthermore, we tackle the issues of rich Czech morphology by examining different pre-processing techniques. Experiments show that our language-independent approach significantly outperforms adapted state-of-the-art methods in English (F-measure 0.947) and also represents a strong baseline for further research in Czech (F-measure 0.582).
Název v anglickém jazyce
Sarcasm Detection on Czech and English Twitter
Popis výsledku anglicky
This paper presents a machine learning approach to sarcasm detection on Twitter in two languages -- English and Czech. This is the first attempt at sarcasm detection in the Czech language. We created a large Czech Twitter corpus consisting of 7,000 manually-labelled tweets and provide it to the community. We evaluate two classifiers with various combinations of features on both the Czech and English datasets. Furthermore, we tackle the issues of rich Czech morphology by examining different pre-processing techniques. Experiments show that our language-independent approach significantly outperforms adapted state-of-the-art methods in English (F-measure 0.947) and also represents a strong baseline for further research in Czech (F-measure 0.582).

Klasifikace

Druh
D - Stať ve sborníku
CEP obor
JD - Využití počítačů, robotika a její aplikace
OECD FORD obor
—

Návaznosti výsledku

Projekt
—
Návaznosti
S - Specificky vyzkum na vysokych skolach

Ostatní

Rok uplatnění
2014
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název statě ve sborníku
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
ISBN
978-1-941643-26-6
ISSN
—
e-ISSN
—
Počet stran výsledku
11
Strana od-do
213-223
Název nakladatele
neuveden
Místo vydání
neuveden
Místo konání akce
Dublin
Datum konání akce
23. 8. 2014
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
—

Podobné výsledky(10)

Enhancing Cross-Lingual Sarcasm Detection by a Prompt Learning Framework with Data Augmentation and Contrastive Learning The unbearable hurtfulness of sarcasm Identification of Multiword Expressions in Tweets for Hate Speech Detection

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Sarcasm Detection on Czech and English Twitter

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)