Quantifying Syntactic Complexity in Czech Texts: An Analysis of Mean Dependency Distance and Average Sentence Length Across Genres
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61988987%3A17250%2F24%3AA250385U" target="_blank" >RIV/61988987:17250/24:A250385U - isvavai.cz</a>
Alternative codes found
RIV/00216208:11320/25:YDKMBR5R
Result on the web
<a href="https://www.tandfonline.com/doi/full/10.1080/09296174.2024.2370459" target="_blank" >https://www.tandfonline.com/doi/full/10.1080/09296174.2024.2370459</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1080/09296174.2024.2370459" target="_blank" >10.1080/09296174.2024.2370459</a>
Alternative languages
Result language
English
Title in original language
Quantifying Syntactic Complexity in Czech Texts: An Analysis of Mean Dependency Distance and Average Sentence Length Across Genres
Result description in original language
This study investigates the syntactic complexity of various text-types in the Czech language by analysing the mean dependency distance (MDD), a measure that quantifies the average distance between syntactic heads and their dependents within a sentence, and average sentence length (ASL). Using data from the SYN2020 corpus, a large and balanced collection of contemporary written Czech, we calculate the MDD and ASL for different text-types. Our findings reveal distinct patterns in the MDD and ASL values across genres, suggesting that syntactic complexity varies among different types of texts. We observe a clear distinction between fiction and non-fiction genres, with fiction exhibiting lower MDD and ASL values, indicating a more compact syntactic structure. Non-fiction genres, particularly scientific literature, display higher MDD and ASL values, reflecting more complex syntactic constructions. Journalistic texts, such as newspapers and magazines, fall between fiction and non-fiction in terms of MDD and ASL values. These results demonstrate the potential of MDD and ASL as quantitative measures for characterizing and differentiating text-types based on their syntactic complexity. Furthermore, our analysis contributes to a deeper understanding of the syntactic variations across diverse genres in the Czech language.
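The two measures named in the description can be illustrated with a minimal sketch. The record does not state the exact formula the authors used, so this assumes one common formulation: the dependency distance of a token is the absolute difference between its linear position and that of its head, the root is excluded, and punctuation is not treated specially. The input format (a list of 1-based head indices per sentence, with 0 for the root) and the function names are hypothetical and chosen only for illustration; they are not the SYN2020 format or the authors' code.

```python
from statistics import mean
from typing import List

# Assumed input format (not the SYN2020 format): one list per sentence,
# where heads[i - 1] is the 1-based position of the head of token i,
# and 0 marks the root of the sentence.
Sentence = List[int]


def mean_dependency_distance(heads: Sentence) -> float:
    """Average |dependent position - head position|, with the root excluded."""
    distances = [
        abs(position - head)
        for position, head in enumerate(heads, start=1)
        if head != 0  # the root has no head, so it contributes no distance
    ]
    if not distances:
        raise ValueError("sentence has no dependency links")
    return mean(distances)


def average_sentence_length(sentences: List[Sentence]) -> float:
    """Mean number of tokens per sentence."""
    return mean(len(sentence) for sentence in sentences)


# Toy sentence (invented, not corpus data): "Petr čte novou knihu"
# Petr -> čte (2), čte -> root (0), novou -> knihu (4), knihu -> čte (2)
toy_sentence = [2, 0, 4, 2]
print(mean_dependency_distance(toy_sentence))
print(average_sentence_length([toy_sentence]))
```

In the toy sentence the three non-root tokens have distances 1, 1, and 2. A per-genre comparison like the one described would average such values over all sentences of each text-type.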
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
60203 - Linguistics
Result continuities
Project
<a href="/cs/project/GA22-20632S" target="_blank" >GA22-20632S: Quantitative Syntactic Stylistics of Contemporary Written Czech</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Journal of Quantitative Linguistics
ISSN
0929-6174
e-ISSN
1744-5035
Volume of the periodical
—
Issue of the periodical within the volume
3
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages
14
Pages from-to
260-273
UT code of the article in WoS
001259286900001
EID of the result in the Scopus database
2-s2.0-85197575977