Harnessing the Power of LLMs for Service Quality Assessment from User-Generated Content
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28120%2F24%3A63576618" target="_blank" >RIV/70883521:28120/24:63576618 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/10599371" target="_blank" >https://ieeexplore.ieee.org/document/10599371</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ACCESS.2024.3429290" target="_blank" >10.1109/ACCESS.2024.3429290</a>
Alternative languages
Result language
English
Title in original language
Harnessing the Power of LLMs for Service Quality Assessment from User-Generated Content
Result description in original language
Adopting Large Language Models (LLMs) creates opportunities for organizations to increase efficiency, particularly in sentiment analysis and information extraction tasks. This study explores the efficiency of LLMs in real-world applications, focusing on sentiment analysis and service quality dimension extraction from user-generated content (UGC). For this purpose, we compare the performance of two LLMs (ChatGPT 3.5 and Claude 3) and three traditional NLP methods using two datasets of customer reviews (one in English and one in Persian). The results indicate that LLMs can achieve notable accuracy in information extraction (76% accuracy for ChatGPT and 68% for Claude 3) and sentiment analysis (substantial agreement with human raters for ChatGPT and moderate agreement with human raters for Claude 3), demonstrating an improvement compared to other AI models. However, challenges persist, including discrepancies between model predictions and human judgments and limitations in extracting specific dimensions from unstructured text. Whereas LLMs can streamline the SQ assessment process, human supervision remains essential to ensure reliability.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
50204 - Business and management
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of publication
2024
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
IEEE Access
ISSN
2169-3536
e-ISSN
2169-3536
Volume of the periodical
2024
Issue of the periodical within the volume
12
Country of the periodical's publisher
US - United States of America
Number of pages
13
Pages from-to
99755-99767
UT WoS code of the article
001276352700001
EID of the result in the Scopus database
2-s2.0-85199109175