Fine-tuning language models to predict item difficulty from wording
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F67985807%3A_____%2F24%3A00601131" target="_blank" >RIV/67985807:_____/24:00601131 - isvavai.cz</a>
Result on the web
<a href="https://www.psychometricsociety.org/sites/main/files/file-attachments/imps2024_abstracts.pdf?1720733361#page=347" target="_blank" >https://www.psychometricsociety.org/sites/main/files/file-attachments/imps2024_abstracts.pdf?1720733361#page=347</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Fine-tuning language models to predict item difficulty from wording
Original language description
BASIC INFORMATION: IMPS 2024 Abstracts. Prague: IMPS, 2024. p. 309. [IMPS 2024: Annual Meeting of the Psychometric Society. 16.07.2024-19.07.2024, Prague].
ABSTRACT: In the domain of educational assessment, crafting items with robust psychometric properties poses significant challenges, especially when pretesting on a pilot population is not feasible. This necessitates reliable methods for estimating difficulty (and possibly other parameters) based solely on item wording. Traditionally, this involves extracting a wide array of theory-driven text features, ranging from basic descriptive statistics to readability indices, as predictors of item difficulty. To derive these text features, item wordings must first undergo extensive preprocessing, which results in a loss of crucial information (e.g., due to lemmatization). Recently, the focus has shifted towards predictors based on word embeddings, for instance, to better capture the semantics (Štěpánek et al., 2023). With the advent of large language models (LLMs) such as transformers, however, adapting them for item difficulty prediction presents a promising opportunity. Although these models were originally trained on large corpora of textual data for tasks like masked text prediction, we can leverage transfer learning and fine-tune these pre-trained LLMs for the task of item difficulty prediction. Thus, we may benefit from the nuanced language representation of modern LLMs without any loss of information along the way and without the need for any separate statistical model. In this work, we propose and test an innovative approach that fine-tunes pre-trained LLMs to estimate item difficulty from wording. By integrating these modern LLMs, we aim to achieve more accurate predictions of item characteristics, potentially aiding the development and evaluation of educational assessments.
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/EH22_008%2F0004583" target="_blank" >EH22_008/0004583: Research of Excellence on Digital Technologies and Wellbeing</a>
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations