All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

Fine-Tuned Understanding: Enhancing Social Bot Detection with Transformer-Based Classification

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3A5EX7AX5X" target="_blank" >RIV/00216208:11320/25:5EX7AX5X - isvavai.cz</a>

  • Result on the web

    <a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200814228&doi=10.1109%2fACCESS.2024.3440657&partnerID=40&md5=bdf14a3d9efe0606aae4a9f82e7f16ce" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200814228&doi=10.1109%2fACCESS.2024.3440657&partnerID=40&md5=bdf14a3d9efe0606aae4a9f82e7f16ce</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.1109/ACCESS.2024.3440657" target="_blank" >10.1109/ACCESS.2024.3440657</a>

Alternative languages

  • Result language

    angličtina

  • Original language name

    Fine-Tuned Understanding: Enhancing Social Bot Detection with Transformer-Based Classification

  • Original language description

    In recent years, the proliferation of online communication platforms and social media has given rise to a new wave of challenges, including the rapid spread of malicious bots. These bots, often programmed to impersonate human users, can infiltrate online communities, disseminate misinformation, and engage in various activities detrimental to the integrity of digital discourse. It is becoming more and more difficult to discern a text produced by deep neural networks from that created by humans. Transformer-based Pre-trained Language Models (PLMs) have recently shown excellent results in challenges involving natural language understanding (NLU). The suggested method is to employ an approach to detect bots at the tweet level by utilizing content and fine-tuning PLMs, to reduce the current threat. Building on the recent developments of the BERT (Bidirectional Encoder Representations from Transformers) and GPT-3, the suggested model employs a text embedding approach. This method offers a high-quality representation that can enhance the efficacy of detection. In addition, a Feedforward Neural Network (FNN) was used on top of the PLMs for final classification. The model was experimentally evaluated using the Twitter bot dataset. The strategy was tested using test data that came from the same distribution as their training set. The methodology in this paper involves preprocessing Twitter data, generating contextual embeddings using PLMs, and designing a classification model that learns to differentiate between human users and bots. Experiments were carried out adopting advanced Language Models to construct an encoding of the tweet to create a potential input vector on top of BERT and their variants. By employing Transformer-based models, we achieve significant improvements in bot detection F1-score (93%) compared to traditional methods such as Word2Vec and Global Vectors for Word Representation (Glove). Accuracy improvements ranging from 3% to 24% compared to baselines were achieved. The capability of GPT-4, an advanced Large Language Model (LLM), in interpreting bot-generated content is examined in this research. Additionally, explainable artificial intelligence (XAI) was utilized alongside transformer-based models for detecting bots on social media, enhancing the transparency and reliability of these models. © 2013 IEEE.

  • Czech name

  • Czech description

Classification

  • Type

    J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

Others

  • Publication year

    2024

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Name of the periodical

    IEEE Access

  • ISSN

    2169-3536

  • e-ISSN

  • Volume of the periodical

    12

  • Issue of the periodical within the volume

    2024

  • Country of publishing house

    US - UNITED STATES

  • Number of pages

    20

  • Pages from-to

    118250-118269

  • UT code for WoS article

  • EID of the result in the Scopus database

    2-s2.0-85200814228