Prague Dependency Treebank - Consolidated 1.0
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F20%3A10424449" target="_blank" >RIV/00216208:11320/20:10424449 - isvavai.cz</a>
Result on the web
<a href="https://www.aclweb.org/anthology/2020.lrec-1.641" target="_blank" >https://www.aclweb.org/anthology/2020.lrec-1.641</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Prague Dependency Treebank - Consolidated 1.0
Original language description
We present a richly annotated and genre-diversified language resource, the Prague Dependency Treebank-Consolidated 1.0 (PDT-C 1.0), whose purpose is, as has always been the case for the family of Prague Dependency Treebanks, to serve both as training data for various types of NLP tasks and as a resource for linguistically oriented research. PDT-C 1.0 contains four different datasets of Czech, uniformly annotated using the standard PDT scheme. The texts come from different sources: daily newspaper articles, Czech translations of the Wall Street Journal, transcribed dialogs, and a small amount of user-generated, short, often non-standard language segments typed into a web translator. Altogether, the treebank contains around 180,000 sentences with their morphological, surface and deep syntactic annotation. The diversity of the texts and annotations should serve NLP applications well, and it is also an invaluable resource for linguistic research, including comparative studies regarding texts of
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2020
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020)
ISBN
979-10-95546-34-4
ISSN
—
e-ISSN
—
Number of pages
11
Pages from-to
5208-5218
Publisher name
European Language Resources Association
Place of publication
Marseille, France
Event location
Marseille, France
Event date
May 11, 2020
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—