Estimating cognitive text complexity with aggregation of quantile-based models
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A3KLG9ANG" target="_blank" >RIV/00216208:11320/23:3KLG9ANG - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85175547918&doi=10.28995%2f2075-7182-2023-22-525-538&partnerID=40&md5=8eea1c304a8f0a5bae343687a9a47675" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85175547918&doi=10.28995%2f2075-7182-2023-22-525-538&partnerID=40&md5=8eea1c304a8f0a5bae343687a9a47675</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.28995/2075-7182-2023-22-525-538" target="_blank" >10.28995/2075-7182-2023-22-525-538</a>
Alternative languages
Result language
English
Title in the original language
Estimating cognitive text complexity with aggregation of quantile-based models
Result description in the original language
"In this paper, we introduce a novel approach to estimating the cognitive complexity of a text at different levels of language: phonetic, morphemic, lexical, and syntactic. The proposed method detects tokens with an abnormal frequency of complexity scores. The frequencies are taken from the empirical distributions calculated over the reference corpus of texts. We use the Russian Wikipedia for this purpose. Ensemble models are combined from individual models from different language levels. We created datasets of pairs of text fragments taken from social studies textbooks of different grades to train the ensembles. Empirical evidence shows that the proposed approach outperforms existing methods, such as readability indices, in estimating text complexity in terms of accuracy. The purpose of this study is to create one of the important components of the system of recommendation of scientific and educational content. © Dialogue 2023. All rights reserved."
Title in English
Estimating cognitive text complexity with aggregation of quantile-based models
Result description in English
"In this paper, we introduce a novel approach to estimating the cognitive complexity of a text at different levels of language: phonetic, morphemic, lexical, and syntactic. The proposed method detects tokens with an abnormal frequency of complexity scores. The frequencies are taken from the empirical distributions calculated over the reference corpus of texts. We use the Russian Wikipedia for this purpose. Ensemble models are combined from individual models from different language levels. We created datasets of pairs of text fragments taken from social studies textbooks of different grades to train the ensembles. Empirical evidence shows that the proposed approach outperforms existing methods, such as readability indices, in estimating text complexity in terms of accuracy. The purpose of this study is to create one of the important components of the system of recommendation of scientific and educational content. © Dialogue 2023. All rights reserved."
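The quantile idea in the description can be illustrated with a small Python sketch. It is not the authors' implementation: the per-token scores (`token_length`, `vowel_count`), the 0.9 quantile cut-off, the toy English reference tokens, and the weighted averaging in `ensemble_complexity` are all assumptions made for illustration. The record itself states only that the reference distributions come from the Russian Wikipedia and that the ensembles are trained on pairs of textbook fragments, which this sketch does not reproduce.

```python
# Hypothetical sketch: flag tokens whose complexity score is unusually high
# relative to a reference corpus, then average several such models.

def token_length(token):
    """Stand-in lexical-level score (assumption): number of characters."""
    return float(len(token))


def vowel_count(token):
    """Stand-in phonetic-level score (assumption): number of vowel letters."""
    return float(sum(ch in "aeiouy" for ch in token.lower()))


class QuantileModel:
    """Scores a text by the share of tokens above a reference quantile."""

    def __init__(self, score_fn, reference_tokens, quantile=0.9):
        self.score_fn = score_fn
        # Empirical distribution of the score over the reference corpus.
        scores = sorted(score_fn(t) for t in reference_tokens)
        idx = min(len(scores) - 1, int(quantile * len(scores)))
        self.threshold = scores[idx]

    def complexity(self, tokens):
        if not tokens:
            return 0.0
        flagged = sum(self.score_fn(t) > self.threshold for t in tokens)
        return flagged / len(tokens)


def ensemble_complexity(models, tokens, weights=None):
    """Weighted average of the individual models (uniform by default)."""
    weights = weights or [1.0] * len(models)
    total = sum(weights)
    return sum(w * m.complexity(tokens) for w, m in zip(weights, models)) / total


# Toy reference corpus; the record says the paper uses the Russian Wikipedia.
reference = "the cat sat on the mat and the dog ran to the big red barn".split()
models = [QuantileModel(token_length, reference),
          QuantileModel(vowel_count, reference)]

simple = "the dog ran".split()
hard = "constitutional sovereignty requires legitimacy".split()
more_complex = max((simple, hard), key=lambda t: ensemble_complexity(models, t))
```

In a pairwise setting like the one the description mentions, the fragment with the higher ensemble score would be predicted as the more complex of the two; learning the weights from labelled pairs is left out here.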
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Other
Year of publication
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
"Komp'ut. Lingvist. Intellekt. Tehnol."
ISBN
—
ISSN
2221-7932
e-ISSN
—
Number of pages
15
Pages from-to
539-553
Publisher name
ABBYY PRODUCTION LLC
Place of publication
—
Event location
Cham
Event date
1. 1. 2023
Event type by nationality
WRD - Worldwide event
UT WoS article code
—