Estimating cognitive text complexity with aggregation of quantile-based models
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A3KLG9ANG" target="_blank" >RIV/00216208:11320/23:3KLG9ANG - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85175547918&doi=10.28995%2f2075-7182-2023-22-525-538&partnerID=40&md5=8eea1c304a8f0a5bae343687a9a47675" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85175547918&doi=10.28995%2f2075-7182-2023-22-525-538&partnerID=40&md5=8eea1c304a8f0a5bae343687a9a47675</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.28995/2075-7182-2023-22-525-538" target="_blank" >10.28995/2075-7182-2023-22-525-538</a>
Alternative languages
Result language
English
Original language name
Estimating cognitive text complexity with aggregation of quantile-based models
Original language description
"In this paper, we introduce a novel approach to estimating the cognitive complexity of a text at different levels of language: phonetic, morphemic, lexical, and syntactic. The proposed method detects tokens with an abnormal frequency of complexity scores. The frequencies are taken from the empirical distributions calculated over the reference corpus of texts. We use the Russian Wikipedia for this purpose. Ensemble models are combined from individual models from different language levels. We created datasets of pairs of text fragments taken from social studies textbooks of different grades to train the ensembles. Empirical evidence shows that the proposed approach outperforms existing methods, such as readability indices, in estimating text complexity in terms of accuracy. The purpose of this study is to create one of the important components of the system of recommendation of scientific and educational content. © Dialogue 2023. All rights reserved."
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Confidentiality
S - Complete and truthful project data are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
"Komp'ut. Lingvist. Intellekt. Tehnol."
ISBN
—
ISSN
2221-7932
e-ISSN
—
Number of pages
15
Pages from-to
539-553
Publisher name
ABBYY PRODUCTION LLC
Place of publication
—
Event location
Cham
Event date
Jan 1, 2023
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—