On the Correlation of Context-Aware Language Models With the Intelligibility of Polish Target Words to Czech Readers

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F21%3A10442316" target="_blank" >RIV/00216208:11210/21:10442316 - isvavai.cz</a>
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=cifSNDZXal" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=cifSNDZXal</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3389/fpsyg.2021.662277" target="_blank" >10.3389/fpsyg.2021.662277</a>

Alternative languages

Result language
English
Title in original language
On the Correlation of Context-Aware Language Models With the Intelligibility of Polish Target Words to Czech Readers
Result description in original language
This contribution seeks to provide a rational probabilistic explanation for the intelligibility of words in a genetically related language that is unknown to the reader, a phenomenon referred to as intercomprehension. In this research domain, linguistic distance, among other factors, has been shown to correlate well with the mutual intelligibility of individual words. However, the role of context in the intelligibility of target words in sentences has been the subject of very few studies. To address this, we analyze data from web-based experiments in which Czech (CS) respondents were asked to translate highly predictable target words in the final position of Polish sentences. We compare the correlations of target word intelligibility with data from 3-gram language models (LMs) to its correlations with data obtained from context-aware LMs. More specifically, we evaluate two context-aware LM architectures: Long Short-Term Memory networks (LSTMs), which can, theoretically, take infinitely long-distance dependencies into account, and Transformer-based LMs, which can access the whole input sequence at the same time. We investigate how their use of context affects surprisal and its correlation with intelligibility.
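To make the quantities in the description concrete, the following is a minimal sketch of how surprisal from a 3-gram model could be computed and correlated with an intelligibility score. It is not the authors' implementation: the function names (`train_trigram_counts`, `surprisal`, `pearson`), the add-one smoothing, and the choice of Pearson correlation are all assumptions made for illustration, and the record does not state which smoothing or correlation measure the study used.

```python
import math
from collections import Counter


def train_trigram_counts(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences.

    Hypothetical helper: `sentences` is a list of token lists.
    """
    trigrams, contexts, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>", "<s>"] + list(tokens)  # pad so the first word has a context
        vocab.update(tokens)
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            trigrams[context + (padded[i],)] += 1
            contexts[context] += 1
    return trigrams, contexts, vocab


def surprisal(word, context, trigrams, contexts, vocab):
    """Surprisal in bits: -log2 P(word | two preceding words).

    Add-one smoothing is an assumption chosen to keep the sketch short.
    """
    prob = (trigrams[context + (word,)] + 1) / (contexts[context] + len(vocab))
    return -math.log2(prob)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

Given one surprisal value per sentence-final target word and one intelligibility score per item (for instance, the share of respondents who translated the word correctly, which is an assumed measure here), `pearson(surprisals, scores)` would return the correlation the description refers to. A context-aware LM would replace only the probability estimate inside `surprisal`; the correlation step would stay the same.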

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
60203 - Linguistics

Result linkages

Project
—
Linkages
—

Other

Year of implementation
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Name of the periodical
Frontiers in Psychology [online]
ISSN
1664-1078
e-ISSN
—
Volume of the periodical
2021
Issue of the periodical within the volume
12
Country of the publisher
CH - Swiss Confederation
Number of pages
14
Pages from-to
1-14
UT WoS code of the article
000673132500001
EID of the result in the Scopus database
2-s2.0-85110438748

Similar results (10)

Cross-Lingual Dependency Parsing by POS-Guided Word Reordering
Perplexity of n-gram and Dependency Language Models
Text-in-Context: Token-Level Error Detection for Table-to-Text Generation
