Morphological Tagging and Lemmatization of Spoken Corpora of Czech
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F23%3A10473322" target="_blank" >RIV/00216208:11210/23:10473322 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/978-3-031-40498-6_14" target="_blank" >https://doi.org/10.1007/978-3-031-40498-6_14</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-40498-6_14" target="_blank" >10.1007/978-3-031-40498-6_14</a>
Alternative languages
Result language
English
Original language name
Morphological Tagging and Lemmatization of Spoken Corpora of Czech
Original language description
We describe the annotation of corpora of spoken Czech according to a new annotation standard that has been in use since the publication of the SYN2020 corpus of written Czech. The standard distinguishes lemmas from sublemmas, assigns a new attribute to verb forms, and handles multi-word tokens appropriately. To annotate the corpora of spoken Czech by the same standard, new training data for the annotation of spoken text were created, and experiments were performed with training a neural tagger on both written and spoken data.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
60203 - Linguistics
Result continuities
Project
<a href="/en/project/LM2023044" target="_blank" >LM2023044: Czech National Corpus</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2023
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Text, Speech, and Dialogue Lecture Notes in Computer Science
ISBN
978-3-031-40497-9
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
10
Pages from-to
154-163
Publisher name
Springer, Cham
Place of publication
Cham, Switzerland
Event location
Plzeň
Event date
Sep 4, 2023
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—