Unveiling AI-Generated Financial Text: A Computational Approach Using Natural Language Processing and Generative Artificial Intelligence

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3A6JKFVB7W" target="_blank" >RIV/00216208:11320/25:6JKFVB7W - isvavai.cz</a>
Výsledek na webu
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194225825&doi=10.3390%2fcomputation12050101&partnerID=40&md5=9f38b462793be7ea9d0479ff80726c5a" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194225825&doi=10.3390%2fcomputation12050101&partnerID=40&md5=9f38b462793be7ea9d0479ff80726c5a</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/computation12050101" target="_blank" >10.3390/computation12050101</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Unveiling AI-Generated Financial Text: A Computational Approach Using Natural Language Processing and Generative Artificial Intelligence
Popis výsledku v původním jazyce
This study is an in-depth exploration of the nascent field of Natural Language Processing (NLP) and generative Artificial Intelligence (AI), and it concentrates on the vital task of distinguishing between human-generated text and content that has been produced by AI models. Particularly, this research pioneers the identification of financial text derived from AI models such as ChatGPT and paraphrasing tools like QuillBot. While our primary focus is on financial content, we have also pinpointed texts generated by paragraph rewriting tools and utilized ChatGPT for various contexts this multiclass identification was missing in previous studies. In this paper, we use a comprehensive feature extraction methodology that combines TF–IDF with Word2Vec, along with individual feature extraction methods. Importantly, combining a Random Forest model with Word2Vec results in impressive outcomes. Moreover, this study investigates the significance of the window size parameters in the Word2Vec approach, revealing that a window size of one produces outstanding scores across various metrics, including accuracy, precision, recall and the F1 measure, all reaching a notable value of 0.74. In addition to this, our developed model performs well in classification, attaining AUC values of 0.94 for the ‘GPT’ class; 0.77 for the ‘Quil’ class; and 0.89 for the ‘Real’ class. We also achieved an accuracy of 0.72, precision of 0.71, recall of 0.72, and F1 of 0.71 for our extended prepared dataset. This study contributes significantly to the evolving landscape of AI text identification, providing valuable insights and promising directions for future research. © 2024 by the authors.
Název v anglickém jazyce
Unveiling AI-Generated Financial Text: A Computational Approach Using Natural Language Processing and Generative Artificial Intelligence
Popis výsledku anglicky
This study is an in-depth exploration of the nascent field of Natural Language Processing (NLP) and generative Artificial Intelligence (AI), and it concentrates on the vital task of distinguishing between human-generated text and content that has been produced by AI models. Particularly, this research pioneers the identification of financial text derived from AI models such as ChatGPT and paraphrasing tools like QuillBot. While our primary focus is on financial content, we have also pinpointed texts generated by paragraph rewriting tools and utilized ChatGPT for various contexts this multiclass identification was missing in previous studies. In this paper, we use a comprehensive feature extraction methodology that combines TF–IDF with Word2Vec, along with individual feature extraction methods. Importantly, combining a Random Forest model with Word2Vec results in impressive outcomes. Moreover, this study investigates the significance of the window size parameters in the Word2Vec approach, revealing that a window size of one produces outstanding scores across various metrics, including accuracy, precision, recall and the F1 measure, all reaching a notable value of 0.74. In addition to this, our developed model performs well in classification, attaining AUC values of 0.94 for the ‘GPT’ class; 0.77 for the ‘Quil’ class; and 0.89 for the ‘Real’ class. We also achieved an accuracy of 0.72, precision of 0.71, recall of 0.72, and F1 of 0.71 for our extended prepared dataset. This study contributes significantly to the evolving landscape of AI text identification, providing valuable insights and promising directions for future research. © 2024 by the authors.

Klasifikace

Druh
J<sub>SC</sub> - Článek v periodiku v databázi SCOPUS
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Návaznosti výsledku

Projekt
—
Návaznosti
—

Ostatní

Rok uplatnění
2024
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název periodika
Computation
ISSN
2079-3197
e-ISSN
—
Svazek periodika
12
Číslo periodika v rámci svazku
5
Stát vydavatele periodika
US - Spojené státy americké
Počet stran výsledku
24
Strana od-do
1-24
Kód UT WoS článku
—
EID výsledku v databázi Scopus
2-s2.0-85194225825

Podobné výsledky(10)

Harnessing the Power of LLMs for Service Quality Assessment from User-Generated Content The Authenticity of ChatGPT’s Responses in the Tourism and Hospitality Sector: An Explorative Study on Human Perception of Artificial Intelligence A deep learning approach to building a framework for Urdu POS and NER

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Unveiling AI-Generated Financial Text: A Computational Approach Using Natural Language Processing and Generative Artificial Intelligence

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)