Fine-tuning language models to predict item difficulty from wording
Result identifiers
Result code in IS VaVaI
RIV/67985807:_____/24:00601131 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F67985807%3A_____%2F24%3A00601131)
Result on the web
https://www.psychometricsociety.org/sites/main/files/file-attachments/imps2024_abstracts.pdf?1720733361#page=347
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Fine-tuning language models to predict item difficulty from wording
Description in original language
BASIC INFORMATION: IMPS 2024 Abstracts. Prague: IMPS, 2024. p. 309. [IMPS 2024: Annual Meeting of the Psychometric Society, 16.07.2024-19.07.2024, Prague]. ABSTRACT: In the domain of educational assessment, crafting items with robust psychometric properties poses significant challenges, especially when pretesting on a pilot population is not feasible. This necessitates reliable methods for estimating difficulty (and possibly other parameters) based solely on item wording. Traditionally, this involves extracting a wide array of theory-driven text features—ranging from basic descriptive statistics to readability indices—as predictors of item difficulty. To derive these text features, item wordings must first undergo extensive preprocessing, which results in a loss of crucial information (e.g., due to lemmatization). Recently, the focus has shifted towards predictors based on word embeddings, for instance, to better capture the semantics (Štěpánek et al., 2023). However, with the advent of transformer-based large language models (LLMs), adapting them for item difficulty prediction presents a promising opportunity. Although these models were originally trained on large corpora of textual data for tasks such as masked token prediction, we can leverage transfer learning and fine-tune these pre-trained LLMs for the task of item difficulty prediction. Thus, we may benefit from the nuanced language representation of modern LLMs without any loss of information along the way and without the need for a separate statistical model. In this work, we propose and test an innovative approach that fine-tunes pre-trained LLMs to estimate item difficulty from wording. By integrating these modern LLMs, we aim to achieve more accurate predictions of item characteristics, potentially aiding the development and evaluation of educational assessments.
Title in English
Fine-tuning language models to predict item difficulty from wording
Description in English
BASIC INFORMATION: IMPS 2024 Abstracts. Prague: IMPS, 2024. p. 309. [IMPS 2024: Annual Meeting of the Psychometric Society, 16.07.2024-19.07.2024, Prague]. ABSTRACT: In the domain of educational assessment, crafting items with robust psychometric properties poses significant challenges, especially when pretesting on a pilot population is not feasible. This necessitates reliable methods for estimating difficulty (and possibly other parameters) based solely on item wording. Traditionally, this involves extracting a wide array of theory-driven text features—ranging from basic descriptive statistics to readability indices—as predictors of item difficulty. To derive these text features, item wordings must first undergo extensive preprocessing, which results in a loss of crucial information (e.g., due to lemmatization). Recently, the focus has shifted towards predictors based on word embeddings, for instance, to better capture the semantics (Štěpánek et al., 2023). However, with the advent of transformer-based large language models (LLMs), adapting them for item difficulty prediction presents a promising opportunity. Although these models were originally trained on large corpora of textual data for tasks such as masked token prediction, we can leverage transfer learning and fine-tune these pre-trained LLMs for the task of item difficulty prediction. Thus, we may benefit from the nuanced language representation of modern LLMs without any loss of information along the way and without the need for a separate statistical model. In this work, we propose and test an innovative approach that fine-tunes pre-trained LLMs to estimate item difficulty from wording. By integrating these modern LLMs, we aim to achieve more accurate predictions of item characteristics, potentially aiding the development and evaluation of educational assessments.
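To make the described approach concrete, the following is a minimal sketch (not the authors' implementation) of fine-tuning a pre-trained transformer for item difficulty regression with the Hugging Face transformers and datasets libraries; the checkpoint name, toy item data, and hyperparameters are illustrative assumptions.

import numpy as np
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical toy data: raw item wordings paired with calibrated difficulties
# (e.g., IRT b-parameters). No lemmatization or other lossy preprocessing is applied.
items = {"text": ["Solve for x: 2x + 3 = 11.", "Name the capital of France."],
         "label": [0.42, -1.10]}
dataset = Dataset.from_dict(items).train_test_split(test_size=0.5, seed=42)

model_name = "bert-base-uncased"  # assumed checkpoint; any pre-trained encoder could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels=1 turns the classification head into a single regression output,
# so the Trainer optimizes mean squared error on the difficulty values.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)

def tokenize(batch):
    # Tokenize the raw item wording directly, without feature extraction.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

def rmse(eval_pred):
    # Root-mean-squared error between predicted and observed difficulties.
    preds, labels = eval_pred
    return {"rmse": float(np.sqrt(np.mean((preds.squeeze() - labels) ** 2)))}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="difficulty-model", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    compute_metrics=rmse,
)
trainer.train()
print(trainer.evaluate())

In practice, the toy data would be replaced by item texts with difficulty estimates from a calibration sample, and the held-out RMSE could be compared against feature-based or embedding-based baselines.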
Classification
Type
O - Other results
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
EH22_008/0004583: Excelentní výzkum v oblasti digitálních technologií a wellbeingu (Excellent research in digital technologies and wellbeing)
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations