Estimation of Average Information Content: Comparison of Impact of Contexts
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/19:10427043 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10427043)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Estimation of Average Information Content: Comparison of Impact of Contexts
Result description in the original language
In this paper, we compare Linear Mixed Effect Models (LMM) that use Average Information Content (IC) and frequency as predictors of the lengths of aspect-marked verbs. IC is the information that target elements convey to their context. Focusing on typologically diverse languages, we took dependency frames and n-grams as contexts and found that IC estimated from n-grams outperforms IC estimated from dependency frames: models that use IC from n-grams achieve high correlations between predicted and actual verb lengths, while models that use IC from dependency frames perform poorly. We found predictive effects of IC in only a few languages.
Title in English
Estimation of Average Information Content: Comparison of Impact of Contexts
Result description in English
In this paper, we compare Linear Mixed Effect Models (LMM) that use Average Information Content (IC) and frequency as predictors of the lengths of aspect-marked verbs. IC is the information that target elements convey to their context. Focusing on typologically diverse languages, we took dependency frames and n-grams as contexts and found that IC estimated from n-grams outperforms IC estimated from dependency frames: models that use IC from n-grams achieve high correlations between predicted and actual verb lengths, while models that use IC from dependency frames perform poorly. We found predictive effects of IC in only a few languages.
Classification
Type
O - Other results
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Other
Publication/application year
2019
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations