Velké jazykové modely prizmatem korpusové lingvistiky

Project title in English
Large language models through the prism of corpus linguistics
Abstract in English
This project aims to investigate the differences between human-written texts and texts produced by large language models (LLMs) using corpus-linguistic methods such as classical stylometry and multidimensional analysis, while also addressing aspects of perception, including human evaluations of persuasiveness. The project will focus on both English and Czech and on the differences between them. Because large language models are trained primarily on English texts, we will also examine how this affects their Czech output; we assume that conceptualizations acquired from English will influence the texts the models produce in Czech. These topics will be investigated using a unique corpus of texts generated by various LLMs, which will be published and made accessible to the international academic community.

R&D category
ZV - Basic research
OECD FORD - main field
60203 - Linguistics
OECD FORD - secondary field
—
OECD FORD - additional secondary field
—
CEP - corresponding fields (according to the converter: http://www.vyzkum.cz/storage/att/E6EF7938F0E854BAE520AC119FB22E8D/Prevodnik_oboru_Frascati.pdf)
AI - Linguistics

Data confidentiality
S - Complete and truthful project data are not subject to protection under special legal regulations
System designation of the data delivery
CEP25-GA0-GA-R
Record delivery date
21. 2. 2025
