PDTSC 2.0 - Spoken Corpus with Rich Multi-layer Structural Annotation
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F17%3A10372163" target="_blank" >RIV/00216208:11320/17:10372163 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
PDTSC 2.0 - Spoken Corpus with Rich Multi-layer Structural Annotation
Original language description
We present a richly annotated spoken language resource, the Prague Dependency Treebank of Spoken Czech 2.0, the primary purpose of which is to serve speech-related NLP tasks. The treebank features several novel annotation schemas close to the audio and transcript, and the morphological, syntactic and semantic annotation corresponds to the family of Prague Dependency Treebanks; it can thus also be used for linguistic studies, including comparative studies of text and speech. The most distinctive feature is our approach to syntactic annotation, which differs from that of similar corpora such as Treebank-3 [8] in that it does not attempt to impose syntactic structure on the input; instead, it includes one more layer that edits the literal transcript into fluent Czech while keeping the original transcript explicitly aligned with the edited version. This allows the morphological, syntactic and semantic annotation to be deterministically and fully mapped back to the transcript and audio. It brings n
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
20th International Conference, TSD 2017, Prague, Czech Republic, August 27–31, 2017, Proceedings
ISBN
978-3-319-64205-5
ISSN
0302-9743
e-ISSN
not stated
Number of pages
9
Pages from-to
129-137
Publisher name
Springer International Publishing
Place of publication
Cham / Heidelberg / New York
Event location
Prague, Czechia
Event date
Aug 27, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—