Unpacking lexical intertextuality: Vocabulary shared among texts
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F22%3A10452153" target="_blank" >RIV/00216208:11210/22:10452153 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1515/9783110763560-009" target="_blank" >https://doi.org/10.1515/9783110763560-009</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1515/9783110763560-009" target="_blank" >10.1515/9783110763560-009</a>
Alternative languages
Result language
English
Title in original language
Unpacking lexical intertextuality: Vocabulary shared among texts
Result description in original language
This paper focuses on lexical intertextuality, namely the following three intertextual properties: 1) the number of word-types shared by two texts; 2) the number of word-types shared by all texts in a collection; 3) the number of word-types shared by equal-sized segments of a collection. We have observed that the relation between the number of texts and the number of shared types follows a power law; similar behavior can be seen if text boundaries are disregarded and the corpus is artificially divided into equal-sized segments. The number of shared types is proportional to the size of these segments. We developed baseline models for the number of shared types, i.e., models predicting the number of types shared by texts if all tokens were randomly shuffled and evenly spread among texts. The comparison between the empirical data and the baseline model can be used for contrastive purposes, to compare the number of shared types in corpora of different languages.
Title in English
Unpacking lexical intertextuality: Vocabulary shared among texts
Result description in English
This paper focuses on lexical intertextuality, namely the following three intertextual properties: 1) the number of word-types shared by two texts; 2) the number of word-types shared by all texts in a collection; 3) the number of word-types shared by equal-sized segments of a collection. We have observed that the relation between the number of texts and the number of shared types follows a power law; similar behavior can be seen if text boundaries are disregarded and the corpus is artificially divided into equal-sized segments. The number of shared types is proportional to the size of these segments. We developed baseline models for the number of shared types, i.e., models predicting the number of types shared by texts if all tokens were randomly shuffled and evenly spread among texts. The comparison between the empirical data and the baseline model can be used for contrastive purposes, to compare the number of shared types in corpora of different languages.
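The quantities named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer (lowercased `\w+` matches), the function names, and the Monte Carlo form of the shuffled baseline are all assumptions made for illustration, and the chapter's own baseline models may well be analytical rather than simulated.

```python
import random
import re


def tokenize(text):
    """Assumed tokenizer: lowercased runs of word characters."""
    return re.findall(r"\w+", text.lower())


def shared_types(texts):
    """Number of word-types that occur in every text of the collection."""
    type_sets = [set(tokenize(t)) for t in texts]
    return len(set.intersection(*type_sets)) if type_sets else 0


def equal_segments(tokens, size):
    """Cut a token list into consecutive segments of `size` tokens.

    Text boundaries are ignored; an incomplete final segment is dropped.
    """
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


def shuffled_baseline(texts, runs=100, seed=0):
    """Hypothetical Monte Carlo baseline (an assumption, not the paper's model).

    All tokens are pooled, shuffled, and spread evenly over as many parts
    as there are texts; the shared-type count is averaged over `runs`.
    """
    rng = random.Random(seed)
    pool = [tok for t in texts for tok in tokenize(t)]
    n = len(texts)
    size = len(pool) // n if n else 0
    if size == 0:
        return 0.0
    total = 0
    for _ in range(runs):
        rng.shuffle(pool)
        parts = [set(pool[i * size:(i + 1) * size]) for i in range(n)]
        total += len(set.intersection(*parts))
    return total / runs


if __name__ == "__main__":
    docs = ["the cat sat", "the dog sat", "the sat"]
    print(shared_types(docs))       # 2 ("the" and "sat" occur in all three)
    print(shuffled_baseline(docs))  # average over shuffled, evenly sized parts
```

Comparing `shared_types` on the real texts with `shuffled_baseline` on the same tokens mirrors the contrast between empirical data and baseline that the description mentions; a real study would need a proper tokenizer and far larger corpora.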
Classification
Type
C - Chapter in a scholarly book
CEP field
—
OECD FORD field
60203 - Linguistics
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the book or proceedings
Quantitative Approaches to Universality and Individuality in Language
ISBN
978-3-11-076356-0
Number of pages of the result
15
Pages from-to
101-115
Number of pages of the book
237
Publisher name
De Gruyter Mouton
Place of publication
Deutschland
UT WoS code of the chapter
—