On the Correlation of Context-Aware Language Models With the Intelligibility of Polish Target Words to Czech Readers
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F21%3A10442316" target="_blank" >RIV/00216208:11210/21:10442316 - isvavai.cz</a>
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=cifSNDZXal" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=cifSNDZXal</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3389/fpsyg.2021.662277" target="_blank" >10.3389/fpsyg.2021.662277</a>
Alternative languages
Result language
English
Original language name
On the Correlation of Context-Aware Language Models With the Intelligibility of Polish Target Words to Czech Readers
Original language description
This contribution seeks to provide a rational probabilistic explanation for the intelligibility of words in a genetically related language that is unknown to the reader, a phenomenon referred to as intercomprehension. In this research domain, linguistic distance, among other factors, has been shown to correlate well with the mutual intelligibility of individual words. However, the role of context in the intelligibility of target words in sentences has been the subject of very few studies. To address this, we analyze data from web-based experiments in which Czech (CS) respondents were asked to translate highly predictable target words at the final position of Polish sentences. We compare correlations of target word intelligibility with data from 3-gram language models (LMs) to their correlations with data obtained from context-aware LMs. More specifically, we evaluate two context-aware LM architectures: Long Short-Term Memory networks (LSTMs), which can, theoretically, take infinitely long-distance dependencies into account, and Transformer-based LMs, which can access the whole input sequence at the same time. We investigate how their use of context affects surprisal and its correlation with intelligibility.
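The two quantities named in the description, surprisal and its correlation with intelligibility, can be illustrated with a minimal sketch. Surprisal of a word is commonly defined as the negative base-2 logarithm of its probability given the preceding context. Everything else here is an assumption made for illustration and is not taken from the article: the add-one smoothed trigram model, the helper names `train_trigram`, `surprisal`, and `pearson`, the placeholder corpus, and the intelligibility scores are all hypothetical.

```python
import math
from collections import Counter


def train_trigram(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>", "<s>"] + list(tokens)  # pad so every word has two predecessors
        vocab.update(tokens)
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi, vocab


def surprisal(model, context, word):
    """Surprisal in bits: -log2 P(word | last two context tokens).

    Uses add-one smoothing (an assumption of this sketch) so that
    unseen trigrams still receive a non-zero probability.
    """
    tri, bi, vocab = model
    u, v = (["<s>", "<s>"] + list(context))[-2:]
    p = (tri[(u, v, word)] + 1) / (bi[(u, v)] + len(vocab))
    return -math.log2(p)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Placeholder corpus and invented scores, for illustration only.
    corpus = [["a", "b", "c"], ["a", "b", "c"], ["a", "b", "d"]]
    model = train_trigram(corpus)
    targets = ["c", "d", "a"]  # sentence-final target words
    s = [surprisal(model, ["a", "b"], w) for w in targets]
    intelligibility = [0.9, 0.6, 0.2]  # hypothetical share of correct translations
    print(pearson(s, intelligibility))
```

A trigram model of this kind only sees the two preceding words, which is the limitation that the context-aware LMs in the description are meant to overcome. A negative correlation in such a setup would indicate that less predictable target words are harder to translate.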
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
60203 - Linguistics
Result continuities
Project
—
Continuities
—
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Frontiers in Psychology [online]
ISSN
1664-1078
e-ISSN
—
Volume of the periodical
2021
Issue of the periodical within the volume
12
Country of publishing house
CH - SWITZERLAND
Number of pages
14
Pages from-to
1-14
UT code for WoS article
000673132500001
EID of the result in the Scopus database
2-s2.0-85110438748