Vše

Co hledáte?

Vše
Projekty
Výsledky výzkumu
Subjekty

Rychlé hledání

  • Projekty podpořené TA ČR
  • Významné projekty
  • Projekty s nejvyšší státní podporou
  • Aktuálně běžící projekty

Chytré vyhledávání

  • Takto najdu konkrétní +slovo
  • Takto z výsledků -slovo zcela vynechám
  • “Takto můžu najít celou frázi”

The Sentiment is in the Details: A Language-agnostic Approach to Dictionary Expansion and Sentence-level Sentiment Analysis in News Media

Identifikátory výsledku

  • Kód výsledku v IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3A7FQX4GL6" target="_blank" >RIV/00216208:11320/22:7FQX4GL6 - isvavai.cz</a>

  • Výsledek na webu

    <a href="https://www.aup-online.com/content/journals/10.5117/CCR2022.2.003.VRIE" target="_blank" >https://www.aup-online.com/content/journals/10.5117/CCR2022.2.003.VRIE</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.5117/CCR2022.2.003.VRIE" target="_blank" >10.5117/CCR2022.2.003.VRIE</a>

Alternativní jazyky

  • Jazyk výsledku

    angličtina

  • Název v původním jazyce

    The Sentiment is in the Details: A Language-agnostic Approach to Dictionary Expansion and Sentence-level Sentiment Analysis in News Media

  • Popis výsledku v původním jazyce

    Abstract Determining the sentiment in the individual sentences of a newspaper article in an automated fashion is a major challenge. Manually created sentiment dictionaries often fail to meet the required standards. And while computer-generated dictionaries show promise, they are often limited by the availability of suitable linguistic resources. I propose and test a novel, language-agnostic and resource-efficient way of constructing sentiment dictionaries, based on word embedding models. The dictionaries are constructed and evaluated based on four corpora containing two decades of Danish, Dutch (Flanders and the Netherlands), English, and Norwegian newspaper articles, which are cleaned and parsed using Natural Language Processing. Concurrent validity is evaluated using a dataset of human-coded newspaper sentences, and compared to the performance of the Polyglot sentiment dictionaries. Predictive validity is tested through two long-standing hypotheses on the negativity bias in political news. Results show that both the concurrent validity and predictive validity is good. The dictionaries outperform their Polyglot counterparts, and are able to correctly detect a negativity bias, which is stronger for tabloids. The method is resource-efficient in terms of manual labor when compared to manually constructed dictionaries, and requires a limited amount of computational power.

  • Název v anglickém jazyce

    The Sentiment is in the Details: A Language-agnostic Approach to Dictionary Expansion and Sentence-level Sentiment Analysis in News Media

  • Popis výsledku anglicky

    Abstract Determining the sentiment in the individual sentences of a newspaper article in an automated fashion is a major challenge. Manually created sentiment dictionaries often fail to meet the required standards. And while computer-generated dictionaries show promise, they are often limited by the availability of suitable linguistic resources. I propose and test a novel, language-agnostic and resource-efficient way of constructing sentiment dictionaries, based on word embedding models. The dictionaries are constructed and evaluated based on four corpora containing two decades of Danish, Dutch (Flanders and the Netherlands), English, and Norwegian newspaper articles, which are cleaned and parsed using Natural Language Processing. Concurrent validity is evaluated using a dataset of human-coded newspaper sentences, and compared to the performance of the Polyglot sentiment dictionaries. Predictive validity is tested through two long-standing hypotheses on the negativity bias in political news. Results show that both the concurrent validity and predictive validity is good. The dictionaries outperform their Polyglot counterparts, and are able to correctly detect a negativity bias, which is stronger for tabloids. The method is resource-efficient in terms of manual labor when compared to manually constructed dictionaries, and requires a limited amount of computational power.

Klasifikace

  • Druh

    J<sub>ost</sub> - Ostatní články v recenzovaných periodicích

  • CEP obor

  • OECD FORD obor

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Návaznosti výsledku

  • Projekt

  • Návaznosti

Ostatní

  • Rok uplatnění

    2022

  • Kód důvěrnosti údajů

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

  • Název periodika

    Computational Communication Research [online]

  • ISSN

    2665-9085

  • e-ISSN

    1573-7462

  • Svazek periodika

    4

  • Číslo periodika v rámci svazku

    2

  • Stát vydavatele periodika

    NL - Nizozemsko

  • Počet stran výsledku

    39

  • Strana od-do

    424-462

  • Kód UT WoS článku

  • EID výsledku v databázi Scopus