A comparative study of cross-lingual sentiment analysis

Result identifiers

  • Result code in IS VaVaI

    RIV/49777513:23520/24:43971423 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F24%3A43971423)

  • Result on the web

    https://www.sciencedirect.com/science/article/pii/S095741742400112X

  • DOI - Digital Object Identifier

    10.1016/j.eswa.2024.123247 (http://dx.doi.org/10.1016/j.eswa.2024.123247)

Alternative languages

  • Result language

    English

  • Title in the original language

    A comparative study of cross-lingual sentiment analysis

  • Result description in the original language

    This paper presents a detailed comparative study of zero-shot cross-lingual sentiment analysis. Specifically, we use modern multilingual Transformer-based models and linear transformations combined with CNN and LSTM neural networks. We evaluate their performance in Czech, French, and English. We aim to compare and assess the models’ ability to transfer knowledge across languages and discuss the trade-off between their performance and training/inference speed. We build strong monolingual baselines comparable with the current state-of-the-art (SotA) approaches, achieving state-of-the-art results in Czech (96.0% accuracy) and French (97.6% accuracy). Next, we compare our results with the latest large language models (LLMs), i.e., Llama 2 and ChatGPT. We show that the large multilingual Transformer-based XLM-R model consistently outperforms all other cross-lingual approaches in zero-shot cross-lingual sentiment classification, surpassing them by at least 3%. Next, we show that the smaller Transformer-based models are comparable in performance to older but much faster methods with linear transformations. The best-performing model with linear transformation achieved an accuracy of 92.1% on the French dataset, compared to 90.3% achieved by the smaller XLM-R model. Notably, this performance is achieved with approximately 0.01 of the training time required for the XLM-R model. This underscores the potential of linear transformations as a pragmatic alternative to resource-intensive and slower Transformer-based models in real-world applications. The LLMs achieved impressive results that are on par or better, by at least 1%–3%, but with additional hardware requirements and limitations. Overall, this study contributes to understanding cross-lingual sentiment analysis and provides valuable insights into the strengths and limitations of cross-lingual approaches for sentiment analysis.

Classification

  • Type

    Jimp - Article in a periodical indexed in the Web of Science database

  • CEP field

  • OECD FORD field

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

  • Project

  • Linkages

    S - Specific research at universities

Other

  • Year of implementation

    2024

  • Data confidentiality code

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Periodical name

    Expert Systems with Applications

  • ISSN

    0957-4174

  • e-ISSN

    1873-6793

  • Periodical volume

    247

  • Issue number within the volume

    AUG 1 2024

  • Publisher country

    NL - Netherlands

  • Number of pages

    39

  • Pages from-to

  • UT WoS code of the article

    001171252000001

  • EID of the result in the Scopus database

    2-s2.0-85185192813