Annotation scheme and evaluation: the case of OFFENSIVE language

Result identifiers

  • Result code in IS VaVaI

    RIV/00216224:14410/23:00132528 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14410%2F23%3A00132528)

  • Result on the web

    https://hrcak.srce.hr/clanak/444602

  • DOI - Digital Object Identifier

    10.31724/rihjj.49.1.8 (http://dx.doi.org/10.31724/rihjj.49.1.8)

Alternative languages

  • Result language

    English

  • Title in the original language

    Annotation scheme and evaluation: the case of OFFENSIVE language

  • Result description in the original language

    The present paper focuses on the presentation and discussion of aspects of OFFENSIVE LANGUAGE linguistic annotation, including creation, annotation practice, curation, and evaluation of an OFFENSIVE LANGUAGE annotation taxonomy scheme, first proposed in Lewandowska-Tomaszczyk et al. (2021). An extended offensive language ontology comprising 17 categories, structured in terms of 4 hierarchical levels, has been shown to represent the encoding of the defined offensive language schema, trained in terms of non-contextual word embeddings – i.e., Word2Vec and FastText, and eventually juxtaposed to the data acquired by using a pairwise training and testing analysis for existing categories in the HateBERT model (Lewandowska-Tomaszczyk et al. submitted). The study reports on the annotation practice in WG 4.1.1. Incivility in media and social media in the context of COST Action CA 18209 European network for Web-centred linguistic data science (Nexus Linguarum) with the INCEpTION tool (https://github.com/inception-project/inception) – a semantic annotation platform offering assistance in annotation. The results partly support the proposed ontology of explicit offence and positive implicitness types to provide more variance among widely recognized types of figurative language (e.g., metaphorical, metonymic, ironic, etc.). The use of the annotation system and the representation of linguistic data have also been evaluated in a series of the annotators’ comments, using a questionnaire method and in an open discussion. The annotation results and the questionnaire showed that for some of the categories, there was low or medium inter-annotator agreement, and it was more challenging for annotators to distinguish between category items than between aspect items, with the category items of offensive, insulting and abusive being the most difficult in this respect. The need for taxonomic simplification measures in this respect has been recognized for further annotation practices.

  • Title in English

    Annotation scheme and evaluation: the case of OFFENSIVE language

  • Result description in English

    The present paper focuses on the presentation and discussion of aspects of OFFENSIVE LANGUAGE linguistic annotation, including creation, annotation practice, curation, and evaluation of an OFFENSIVE LANGUAGE annotation taxonomy scheme, first proposed in Lewandowska-Tomaszczyk et al. (2021). An extended offensive language ontology comprising 17 categories, structured in terms of 4 hierarchical levels, has been shown to represent the encoding of the defined offensive language schema, trained in terms of non-contextual word embeddings – i.e., Word2Vec and FastText, and eventually juxtaposed to the data acquired by using a pairwise training and testing analysis for existing categories in the HateBERT model (Lewandowska-Tomaszczyk et al. submitted). The study reports on the annotation practice in WG 4.1.1. Incivility in media and social media in the context of COST Action CA 18209 European network for Web-centred linguistic data science (Nexus Linguarum) with the INCEpTION tool (https://github.com/inception-project/inception) – a semantic annotation platform offering assistance in annotation. The results partly support the proposed ontology of explicit offence and positive implicitness types to provide more variance among widely recognized types of figurative language (e.g., metaphorical, metonymic, ironic, etc.). The use of the annotation system and the representation of linguistic data have also been evaluated in a series of the annotators’ comments, using a questionnaire method and in an open discussion. The annotation results and the questionnaire showed that for some of the categories, there was low or medium inter-annotator agreement, and it was more challenging for annotators to distinguish between category items than between aspect items, with the category items of offensive, insulting and abusive being the most difficult in this respect. The need for taxonomic simplification measures in this respect has been recognized for further annotation practices.

Classification

  • Type

    Jimp - Article in a periodical indexed in the Web of Science database

  • CEP field

  • OECD FORD field

    60203 - Linguistics

Result linkages

  • Project

  • Linkages

    I - Institutional support for the long-term conceptual development of a research organisation

Other

  • Year of implementation

    2023

  • Data confidentiality code

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Periodical name

    Rasprave Instituta za Hrvatski Jezik i Jezikoslovlje

  • ISSN

    1331-6745

  • e-ISSN

    1849-0379

  • Periodical volume

    49

  • Issue number within the volume

    1

  • Country of the periodical's publisher

    HR - Republic of Croatia

  • Number of pages of the result

    21

  • Pages from-to

    155-175

  • UT WoS code of the article

    001153374200005

  • EID of the result in the Scopus database

    2-s2.0-85177228943