Annotation scheme and evaluation: the case of OFFENSIVE language

Result identifiers

  • Result code in IS VaVaI

    RIV/00216224:14410/23:00132528 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14410%2F23%3A00132528)

  • Result on the web

    https://hrcak.srce.hr/clanak/444602

  • DOI - Digital Object Identifier

    10.31724/rihjj.49.1.8 (http://dx.doi.org/10.31724/rihjj.49.1.8)

Alternative languages

  • Result language

    English

  • Original language name

    Annotation scheme and evaluation: the case of OFFENSIVE language

  • Original language description

    The present paper focuses on the presentation and discussion of aspects of OFFENSIVE LANGUAGE linguistic annotation, including creation, annotation practice, curation, and evaluation of an OFFENSIVE LANGUAGE annotation taxonomy scheme, first proposed in Lewandowska-Tomaszczyk et al. (2021). An extended offensive language ontology comprising 17 categories, structured in terms of 4 hierarchical levels, has been shown to represent the encoding of the defined offensive language schema, trained in terms of non-contextual word embeddings – i.e., Word2Vec and FastText, and eventually juxtaposed to the data acquired by using a pairwise training and testing analysis for existing categories in the HateBERT model (Lewandowska-Tomaszczyk et al. submitted). The study reports on the annotation practice in WG 4.1.1. Incivility in media and social media in the context of COST Action CA 18209 European network for Web-centred linguistic data science (Nexus Linguarum) with the INCEpTION tool (https://github.com/inception-project/inception) – a semantic annotation platform offering assistance in annotation. The results partly support the proposed ontology of explicit offence and positive implicitness types to provide more variance among widely recognized types of figurative language (e.g., metaphorical, metonymic, ironic, etc.). The use of the annotation system and the representation of linguistic data have also been evaluated in a series of the annotators’ comments, using a questionnaire method and in an open discussion. The annotation results and the questionnaire showed that for some of the categories, there was low or medium inter-annotator agreement, and it was more challenging for annotators to distinguish between category items than between aspect items, with the category items of offensive, insulting and abusive being the most difficult in this respect. The need for taxonomic simplification measures in this respect has been recognized for further annotation practices.

  • Czech name

  • Czech description

Classification

  • Type

    Jimp - Article in a specialist periodical included in the Web of Science database

  • CEP classification

  • OECD FORD branch

    60203 - Linguistics

Result continuities

  • Project

  • Continuities

    I - Institutional support for the long-term conceptual development of a research organisation

Others

  • Publication year

    2023

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    Rasprave Instituta za Hrvatski Jezik i Jezikoslovlje

  • ISSN

    1331-6745

  • e-ISSN

    1849-0379

  • Volume of the periodical

    49

  • Issue of the periodical within the volume

    1

  • Country of publishing house

    HR - CROATIA

  • Number of pages

    21

  • Pages from-to

    155-175

  • UT code for WoS article

    001153374200005

  • EID of the result in the Scopus database

    2-s2.0-85177228943