Automatic Poetic Metre Detection for Czech Verse
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68378068%3A_____%2F24%3A00598455" target="_blank" >RIV/68378068:_____/24:00598455 - isvavai.cz</a>
Alternative codes found
RIV/68407700:21240/24:00376478
Result on the web
<a href="https://ojs.utlib.ee/index.php/smp/article/view/24421" target="_blank" >https://ojs.utlib.ee/index.php/smp/article/view/24421</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.12697/smp.2024.11.1.02" target="_blank" >10.12697/smp.2024.11.1.02</a>
Alternative languages
Result language
English
Original language name
Automatic Poetic Metre Detection for Czech Verse
Original language description
Metrical analysis of verse is an essential and challenging task in the research on versification, consisting of analysing a poem and deciding which metre it is written in. Thanks to existing corpora, we can take advantage of data-driven approaches, which can be better suited to the specific versification problems at hand than rule-based systems. This work analyses the Czech accentual-syllabic verse and automatic metre assignment using the vast and annotated Corpus of Czech Verse. We define the problem as a sequence tagging task and approach it using a machine learning model and many different input data configurations. For comparison with this approach, we reimplement the existing data-driven system KVĚTA. Our results demonstrate that the bidirectional LSTM-CRF sequence tagging model, enhanced with syllable embeddings, significantly outperforms the existing KVĚTA system, with predictions achieving 99.61% syllable accuracy, 98.86% line accuracy, and 90.40% poem accuracy. The model also achieved competitive results with token embeddings. One of the most interesting findings is that the best results are obtained by inputting sequences representing whole poems instead of individual poem lines.
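The three accuracy levels reported in the description (syllable, line, poem) can be illustrated with a minimal sketch. It is not taken from the article: the function name `metre_accuracies`, the "S"/"W" (strong/weak) tag set, and the nested-list data layout are assumptions made for illustration only, and the gold and predicted annotations are assumed to be aligned poem by poem and line by line.

```python
from typing import Dict, List

# Hypothetical layout: a poem is a list of lines, and a line is a list of
# per-syllable tags ("S" = strong position, "W" = weak position).
Poem = List[List[str]]


def metre_accuracies(gold: List[Poem], pred: List[Poem]) -> Dict[str, float]:
    """Compute syllable-, line- and poem-level accuracy for aligned taggings."""
    syl_ok = syl_total = 0
    line_ok = line_total = 0
    poem_ok = 0
    for gold_poem, pred_poem in zip(gold, pred):
        # A poem counts as correct only if every one of its lines is correct.
        poem_correct = len(gold_poem) == len(pred_poem)
        for gold_line, pred_line in zip(gold_poem, pred_poem):
            matches = sum(g == p for g, p in zip(gold_line, pred_line))
            syl_ok += matches
            syl_total += len(gold_line)
            # A line counts as correct only if every syllable tag matches.
            line_correct = (
                len(gold_line) == len(pred_line) and matches == len(gold_line)
            )
            line_ok += int(line_correct)
            line_total += 1
            poem_correct = poem_correct and line_correct
        poem_ok += int(poem_correct)
    return {
        "syllable": syl_ok / syl_total,
        "line": line_ok / line_total,
        "poem": poem_ok / len(gold),
    }


# Toy data: two poems, with one wrong syllable in the second line of the first.
gold = [[["W", "S", "W", "S"], ["W", "S", "W", "S"]], [["S", "W", "S", "W"]]]
pred = [[["W", "S", "W", "S"], ["W", "S", "S", "S"]], [["S", "W", "S", "W"]]]
scores = metre_accuracies(gold, pred)
```

Because a single wrong syllable invalidates its line, and a single wrong line invalidates its poem, poem accuracy is the strictest of the three measures. This is consistent with the ordering of the reported figures (99.61% > 98.86% > 90.40%).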
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
60205 - Literary theory
Result continuities
Project
<a href="/en/project/TL05000288" target="_blank" >TL05000288: Analysis of thematic clusters from the field of current cultural and social categories and their application to literary works of Czech 19th and 20th century</a><br>
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Studia Metrica et Poetica
ISSN
2346-6901
e-ISSN
2346-691X
Volume of the periodical
11
Issue of the periodical within the volume
1
Country of publishing house
EE - ESTONIA
Number of pages
18
Pages from-to
44-61
UT code for WoS article
001312919300002
EID of the result in the Scopus database
2-s2.0-85203253700