Fine-Tuned Understanding: Enhancing Social Bot Detection with Transformer-Based Classification
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3A5EX7AX5X" target="_blank" >RIV/00216208:11320/25:5EX7AX5X - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200814228&doi=10.1109%2fACCESS.2024.3440657&partnerID=40&md5=bdf14a3d9efe0606aae4a9f82e7f16ce" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200814228&doi=10.1109%2fACCESS.2024.3440657&partnerID=40&md5=bdf14a3d9efe0606aae4a9f82e7f16ce</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ACCESS.2024.3440657" target="_blank" >10.1109/ACCESS.2024.3440657</a>
Alternative languages
Result language
angličtina
Original language name
Fine-Tuned Understanding: Enhancing Social Bot Detection with Transformer-Based Classification
Original language description
In recent years, the proliferation of online communication platforms and social media has given rise to a new wave of challenges, including the rapid spread of malicious bots. These bots, often programmed to impersonate human users, can infiltrate online communities, disseminate misinformation, and engage in various activities detrimental to the integrity of digital discourse. It is becoming more and more difficult to discern a text produced by deep neural networks from that created by humans. Transformer-based Pre-trained Language Models (PLMs) have recently shown excellent results in challenges involving natural language understanding (NLU). The suggested method is to employ an approach to detect bots at the tweet level by utilizing content and fine-tuning PLMs, to reduce the current threat. Building on the recent developments of the BERT (Bidirectional Encoder Representations from Transformers) and GPT-3, the suggested model employs a text embedding approach. This method offers a high-quality representation that can enhance the efficacy of detection. In addition, a Feedforward Neural Network (FNN) was used on top of the PLMs for final classification. The model was experimentally evaluated using the Twitter bot dataset. The strategy was tested using test data that came from the same distribution as their training set. The methodology in this paper involves preprocessing Twitter data, generating contextual embeddings using PLMs, and designing a classification model that learns to differentiate between human users and bots. Experiments were carried out adopting advanced Language Models to construct an encoding of the tweet to create a potential input vector on top of BERT and their variants. By employing Transformer-based models, we achieve significant improvements in bot detection F1-score (93%) compared to traditional methods such as Word2Vec and Global Vectors for Word Representation (Glove). Accuracy improvements ranging from 3% to 24% compared to baselines were achieved. The capability of GPT-4, an advanced Large Language Model (LLM), in interpreting bot-generated content is examined in this research. Additionally, explainable artificial intelligence (XAI) was utilized alongside transformer-based models for detecting bots on social media, enhancing the transparency and reliability of these models. © 2013 IEEE.
Czech name
—
Czech description
—
Classification
Type
J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Name of the periodical
IEEE Access
ISSN
2169-3536
e-ISSN
—
Volume of the periodical
12
Issue of the periodical within the volume
2024
Country of publishing house
US - UNITED STATES
Number of pages
20
Pages from-to
118250-118269
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-85200814228