Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/25:MQJSCA3P - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AMQJSCA3P)
Result on the web
http://arxiv.org/abs/2410.16069
DOI - Digital Object Identifier
10.48550/arXiv.2410.16069 (http://dx.doi.org/10.48550/arXiv.2410.16069)
Alternative languages
Result language
English
Original language name
Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context
Original language description
Human processing of idioms relies on understanding the contextual sentences in which idioms occur, as well as language-intrinsic features such as frequency and speaker-intrinsic factors like familiarity. While LLMs have shown high performance on idiomaticity detection tasks, this success may be attributed to reasoning shortcuts in existing datasets. To this end, we construct a novel, controlled contrastive dataset designed to test whether LLMs can effectively use context to disambiguate idiomatic meaning. Additionally, we explore how collocational frequency and sentence probability influence model performance. Our findings reveal that LLMs often fail to resolve idiomaticity when doing so requires attending to the surrounding context, and that models perform better on sentences that have higher likelihood. The collocational frequency of expressions also impacts performance. We make our code and dataset publicly available.
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations