Can we detect ChatGPT-generated texts in Czech and Slovak languages?
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F23%3A00132775" target="_blank" >RIV/00216224:14330/23:00132775 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Can we detect ChatGPT-generated texts in Czech and Slovak languages?
Result description in the original language
The wide availability of generative AI exacerbates existing threats to society. It would not be easy even for linguists to tell whether the text we are reading was generated by a Large Language Model (LLM) or written by a human. Researchers have started developing tools that detect AI-generated content. This paper tested how two of these tools, Compilatio and GPT-2 Output Detector, performed with Czech, Slovak and English texts. There was only one tool somewhat capable of detecting AI-generated texts: Compilatio. Other tools were designed to work only with English texts. Hence, we also tested whether automatically translating the Czech and Slovak texts to English before uploading them to the detectors would have given any promising results. Ultimately, we showed that the texts generated by ChatGPT4 were less detectable than the texts generated by ChatGPT3.5.
Title in English
Can we detect ChatGPT-generated texts in Czech and Slovak languages?
Result description in English
The wide availability of generative AI exacerbates existing threats to society. It would not be easy even for linguists to tell whether the text we are reading was generated by a Large Language Model (LLM) or written by a human. Researchers have started developing tools that detect AI-generated content. This paper tested how two of these tools, Compilatio and GPT-2 Output Detector, performed with Czech, Slovak and English texts. There was only one tool somewhat capable of detecting AI-generated texts: Compilatio. Other tools were designed to work only with English texts. Hence, we also tested whether automatically translating the Czech and Slovak texts to English before uploading them to the detectors would have given any promising results. Ultimately, we showed that the texts generated by ChatGPT4 were less detectable than the texts generated by ChatGPT3.5.
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10200 - Computer and information sciences
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article title in the proceedings
Proceedings of the Sixteenth Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2023
ISBN
9788026317937
ISSN
2336-4289
e-ISSN
—
Number of pages
9
Pages from-to
35-43
Publisher name
Tribun EU
Place of publication
Brno
Event location
Kouty nad Desnou
Event date
1. 1. 2023
Type of event by nationality
EUR - European event
UT WoS article code
—