Automatic Genre Classification of Czech Texts Based on Syntactic Functions

Result identifiers

  • Result code in IS VaVaI

    RIV/61988987:17250/24:A25038JG - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61988987%3A17250%2F24%3AA25038JG)

  • Result on the web

    https://link.springer.com/chapter/10.1007/978-3-031-55917-4_13

  • DOI - Digital Object Identifier

    10.1007/978-3-031-55917-4_13 (http://dx.doi.org/10.1007/978-3-031-55917-4_13)

Alternative languages

  • Result language

    English

  • Original language name

    Automatic Genre Classification of Czech Texts Based on Syntactic Functions

  • Original language description

    Although research on text classification based on syntactic features has been conducted for decades, the recent development of accurate automatic syntactic taggers has enabled scholars to apply these methods to much larger and more diverse datasets than before. This study aims to classify various text types in the Czech language using relative frequencies of syntactic functions, as defined in the Prague Dependency Treebank (PDT). SYN2020, a large balanced corpus of contemporary written Czech, is used as the language material. The distances between texts are calculated by the Cosine Delta method, and hierarchical cluster analysis is then performed. The results indicate that syntactic functions can contribute to automatic genre classification based on large empirical language data.

  • Czech name

  • Czech description
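
The pipeline summarized in the original language description (relative frequencies of syntactic functions, Cosine Delta distances, hierarchical clustering) can be illustrated with a minimal sketch. It assumes the common stylometric definition of Cosine Delta: z-score each feature across texts, then take the cosine distance between the z-score vectors. The record does not state the chapter's exact settings, so the population standard deviation, the average-linkage rule, the function names, and the toy counts below are all assumptions made for illustration, not the authors' implementation or data.

```python
import math
from statistics import mean, pstdev

def relative_frequencies(counts):
    """Turn raw counts of syntactic functions into relative frequencies."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def cosine_delta_matrix(profiles, features):
    """Pairwise Cosine Delta: cosine distance between z-scored profiles."""
    columns = []
    for f in features:
        values = [p.get(f, 0.0) for p in profiles]
        mu, sd = mean(values), pstdev(values)  # population SD: an assumption
        columns.append([(v - mu) / sd if sd > 0 else 0.0 for v in values])
    z = [list(row) for row in zip(*columns)]  # one z-score vector per text

    def cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        if norm_a == 0 or norm_b == 0:
            return 1.0
        return 1.0 - dot / (norm_a * norm_b)

    n = len(z)
    return [[cosine_distance(z[i], z[j]) for j in range(n)] for i in range(n)]

def average_linkage(dist, n_clusters):
    """Naive agglomerative clustering with average linkage (an assumption)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = mean(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]

# Invented toy counts using PDT-style analytical function labels.
FEATURES = ["Sb", "Pred", "Obj", "Atr", "Adv"]
raw_counts = [
    {"Sb": 120, "Pred": 130, "Obj": 90, "Atr": 200, "Adv": 110},  # text 0
    {"Sb": 118, "Pred": 128, "Obj": 95, "Atr": 210, "Adv": 105},  # text 1
    {"Sb": 80, "Pred": 85, "Obj": 70, "Atr": 400, "Adv": 60},     # text 2
    {"Sb": 82, "Pred": 88, "Obj": 68, "Atr": 390, "Adv": 65},     # text 3
]
profiles = [relative_frequencies(c) for c in raw_counts]
dist = cosine_delta_matrix(profiles, FEATURES)
clusters = average_linkage(dist, n_clusters=2)
print(clusters)
```

With these toy counts, texts 0 and 1 share one profile and texts 2 and 3 another, so the two pairs end up in separate clusters. On a real corpus, a library routine for hierarchical clustering would likely replace the naive loop, which rescans all cluster pairs at every merge.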

Classification

  • Type

    C - Chapter in a specialist book

  • CEP classification

  • OECD FORD branch

    60203 - Linguistics

Result continuities

  • Project

    GA22-20632S: Quantitative Syntactic Stylistics of Contemporary Written Czech

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2024

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Book/collection name

    New Frontiers in Textual Data Analysis

  • ISBN

    978-3-031-55916-7

  • Number of pages of the result

    10

  • Pages from-to

    163-172

  • Number of pages of the book

    396

  • Publisher name

    Springer

  • Place of publication

    Cham

  • UT code for WoS chapter