Automatic Genre Classification of Czech Texts Based on Syntactic Functions
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61988987%3A17250%2F24%3AA25038JG" target="_blank" >RIV/61988987:17250/24:A25038JG - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-031-55917-4_13" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-031-55917-4_13</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-55917-4_13" target="_blank" >10.1007/978-3-031-55917-4_13</a>
Alternative languages
Result language
English
Original language name
Automatic Genre Classification of Czech Texts Based on Syntactic Functions
Original language description
Although text classification based on syntactic features has been researched for decades, the recent development of accurate automatic syntactic taggers has enabled scholars to apply these methods to much larger and more diverse datasets than before. This study aims to classify various text types in the Czech language using the relative frequencies of syntactic functions, as defined in the Prague Dependency Treebank (PDT). SYN2020, a large balanced corpus of contemporary written Czech, is used as the language material. The distances between texts are calculated by the Cosine Delta method, and hierarchical cluster analysis is then performed. The results indicate that syntactic functions can contribute to automatic genre classification based on large empirical language data.
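The pipeline named in the description (relative frequencies of syntactic functions, Cosine Delta distances, hierarchical clustering) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the chapter's implementation: the text names and counts are invented toy data, the function labels (Sb, Pred, Obj, Adv, Atr) are a small illustrative subset of PDT-style labels, and average linkage is chosen only because the description does not state a linkage criterion. Cosine Delta is taken here in its usual sense: z-score each feature across texts, then compute the cosine distance between the z-score vectors.

```python
import math
from statistics import mean, pstdev

# Invented toy counts of syntactic functions per text (NOT data from SYN2020).
counts = {
    "news_1":    {"Sb": 120, "Pred": 110, "Obj": 90,  "Adv": 80,  "Atr": 300},
    "news_2":    {"Sb": 118, "Pred": 108, "Obj": 92,  "Adv": 78,  "Atr": 310},
    "fiction_1": {"Sb": 150, "Pred": 160, "Obj": 120, "Adv": 140, "Atr": 150},
    "fiction_2": {"Sb": 148, "Pred": 158, "Obj": 118, "Adv": 145, "Atr": 155},
}
features = ["Sb", "Pred", "Obj", "Adv", "Atr"]


def relative_frequencies(text_counts):
    """Convert raw counts to proportions of all counted functions."""
    total = sum(text_counts.values())
    return {label: n / total for label, n in text_counts.items()}


def zscore_profiles(profiles, features):
    """Standardize each feature across all texts (the 'Delta' step)."""
    stats = {}
    for f in features:
        values = [p.get(f, 0.0) for p in profiles.values()]
        stats[f] = (mean(values), pstdev(values))
    return {
        name: [
            (p.get(f, 0.0) - stats[f][0]) / stats[f][1] if stats[f][1] > 0 else 0.0
            for f in features
        ]
        for name, p in profiles.items()
    }


def cosine_distance(u, v):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def average_linkage(vectors):
    """Naive agglomerative clustering; returns the merge history."""
    clusters = [(name,) for name in vectors]
    merges = []

    def dist(c1, c2):
        return mean(cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[i], clusters[j], dist(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


profiles = {name: relative_frequencies(c) for name, c in counts.items()}
vectors = zscore_profiles(profiles, features)
merges = average_linkage(vectors)
for left, right, d in merges:
    print(left, "+", right, f"(distance {d:.3f})")
```

With these toy counts the two news texts and the two fiction texts merge first, and the two genre clusters join last at a much larger distance. For a real corpus, a library routine such as SciPy's hierarchical clustering would be a more practical choice than this naive loop.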
Czech name
—
Czech description
—
Classification
Type
C - Chapter in a specialist book
CEP classification
—
OECD FORD branch
60203 - Linguistics
Result continuities
Project
<a href="/en/project/GA22-20632S" target="_blank" >GA22-20632S: Quantitative Syntactic Stylistics of Contemporary Written Czech</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Book/collection name
New Frontiers in Textual Data Analysis
ISBN
978-3-031-55916-7
Number of pages of the result
10
Pages from-to
163-172
Number of pages of the book
396
Publisher name
Springer
Place of publication
Cham
UT code for WoS chapter
—