Automatic Genre Classification of Czech Texts Based on Syntactic Functions

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61988987%3A17250%2F24%3AA25038JG" target="_blank" >RIV/61988987:17250/24:A25038JG - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-031-55917-4_13" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-031-55917-4_13</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-55917-4_13" target="_blank" >10.1007/978-3-031-55917-4_13</a>

Result language
English
Original language name
Automatic Genre Classification of Czech Texts Based on Syntactic Functions
Original language description
Although research on text classification based on syntactic features has been conducted for decades, the recent development of accurate automatic syntactic taggers has enabled scholars to apply these methods to much larger and more diverse datasets than before. This study aims to classify various text types in Czech using relative frequencies of syntactic functions, as defined in the Prague Dependency Treebank (PDT). SYN2020, a large balanced corpus of contemporary written Czech, serves as the language material. Distances between texts are calculated with the Cosine Delta method, and hierarchical cluster analysis is then performed on the resulting distances. The results indicate that syntactic functions can contribute to automatic genre classification based on large empirical language data.
Czech name
—
Czech description
—
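
The pipeline named in the description (relative frequencies of syntactic functions, Cosine Delta distances, hierarchical clustering) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the text names and counts are invented toy data, the labels are a small subset of PDT analytical functions, and average linkage is an assumption, since the record does not state which linkage the study used. Cosine Delta is taken here in its usual sense of cosine distance between z-scored feature vectors.

```python
import math
from statistics import mean, pstdev


def relative_frequencies(counts):
    """Turn raw counts of syntactic functions into relative frequencies."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def z_scores(profiles, labels):
    """Standardize each feature across all texts (mean 0, std 1)."""
    vectors = {name: [] for name in profiles}
    for label in labels:
        values = [p.get(label, 0.0) for p in profiles.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, p in profiles.items():
            z = (p.get(label, 0.0) - mu) / sigma if sigma > 0 else 0.0
            vectors[name].append(z)
    return vectors


def cosine_delta(u, v):
    """Cosine distance between two z-score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def average_linkage(vectors):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in sorted(vectors)]
    merges = []

    def dist(a, b):
        # Average pairwise Cosine Delta between members of two clusters.
        return mean(cosine_delta(vectors[x], vectors[y]) for x in a for y in b)

    while len(clusters) > 1:
        pairs = [
            (dist(a, b), a, b)
            for i, a in enumerate(clusters)
            for b in clusters[i + 1:]
        ]
        d, a, b = min(pairs)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Hypothetical toy counts; real input would come from a parsed corpus.
counts = {
    "news_1": {"Pred": 120, "Sb": 110, "Obj": 90, "Adv": 80, "Atr": 300},
    "news_2": {"Pred": 115, "Sb": 105, "Obj": 95, "Adv": 85, "Atr": 310},
    "fiction_1": {"Pred": 200, "Sb": 150, "Obj": 140, "Adv": 160, "Atr": 150},
    "fiction_2": {"Pred": 210, "Sb": 155, "Obj": 135, "Adv": 170, "Atr": 140},
}
labels = ["Pred", "Sb", "Obj", "Adv", "Atr"]

profiles = {name: relative_frequencies(c) for name, c in counts.items()}
vectors = z_scores(profiles, labels)
merges = average_linkage(vectors)

for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

With real data, the sequence of merges plays the role of the dendrogram: texts that merge early at small distances have similar syntactic profiles, while late merges at large distances separate dissimilar groups.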

Project
<a href="/en/project/GA22-20632S" target="_blank" >GA22-20632S: Quantitative Syntactic Stylistics of Contemporary Written Czech</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
