Unpacking lexical intertextuality: Vocabulary shared among texts
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F22%3A10452153" target="_blank" >RIV/00216208:11210/22:10452153 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1515/9783110763560-009" target="_blank" >https://doi.org/10.1515/9783110763560-009</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1515/9783110763560-009" target="_blank" >10.1515/9783110763560-009</a>
Alternative languages
Result language
English
Original language name
Unpacking lexical intertextuality: Vocabulary shared among texts
Original language description
This paper focuses on lexical intertextuality, namely the following three intertextual properties: 1) the number of word-types shared by two texts; 2) the number of word-types shared by all texts in a collection; 3) the number of word-types shared by equal-sized segments of a collection. We have observed that the relation between the number of texts and the number of shared types follows a power law; similar behavior can be seen if text borders are disregarded and the corpus is artificially divided into equal-sized segments. The number of shared types is proportional to the size of these sequences. We developed baseline models for the number of shared types, i.e. models predicting the number of types shared by texts if all tokens were randomly shuffled and evenly spread among texts. The comparison between the empirical data and the baseline model can be used for contrastive purposes, to compare the number of shared types in corpora of different languages.
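The three quantities named in the description can be illustrated with a minimal Python sketch. It assumes texts are already tokenized into lists of word forms and treats each distinct form as one word-type. The function names (`shared_types`, `equal_segments`, `shuffled_baseline`) and the toy corpus are hypothetical, and the baseline here is a simple Monte Carlo simulation of shuffling, swapped in for illustration; it is not the chapter's own baseline model.

```python
import random


def shared_types(texts):
    """Count the word-types that occur in every token list in `texts`."""
    if not texts:
        return 0
    return len(set.intersection(*(set(t) for t in texts)))


def equal_segments(tokens, size):
    """Split a token list into consecutive segments of `size` tokens,
    dropping any shorter remainder at the end."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


def shuffled_baseline(texts, runs=100, seed=0):
    """Simulated baseline (an assumption, not the published model):
    pool all tokens, shuffle them, deal them evenly into as many parts
    as there are texts, and average the shared-type count over `runs`."""
    rng = random.Random(seed)
    pool = [tok for t in texts for tok in t]
    n = len(texts)
    size = len(pool) // n
    if size == 0:
        raise ValueError("not enough tokens to fill every part")
    total = 0
    for _ in range(runs):
        rng.shuffle(pool)
        parts = [pool[i * size:(i + 1) * size] for i in range(n)]
        total += shared_types(parts)
    return total / runs


# Toy corpus, invented for illustration only.
texts = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "a cat and a dog sat".split(),
]

pair_count = shared_types(texts[:2])        # property 1: two texts -> 3
all_count = shared_types(texts)             # property 2: all texts -> 1
corpus = [tok for t in texts for tok in t]
segment_count = shared_types(equal_segments(corpus, 6))  # property 3
baseline = shuffled_baseline(texts)         # average over shuffled corpora
```

Comparing `all_count` with `baseline` mirrors, in a very reduced form, the contrast between empirical data and a shuffled baseline that the description mentions.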
Czech name
—
Czech description
—
Classification
Type
C - Chapter in a specialist book
CEP classification
—
OECD FORD branch
60203 - Linguistics
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2022
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Book/collection name
Quantitative Approaches to Universality and Individuality in Language
ISBN
978-3-11-076356-0
Number of pages of the result
15
Pages from-to
101-115
Number of pages of the book
237
Publisher name
De Gruyter Mouton
Place of publication
Germany
UT code for WoS chapter
—