Annotation of sentence structure: Capturing the relationship between clauses in Czech sentences
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F12%3A10129916" target="_blank" >RIV/00216208:11320/12:10129916 - isvavai.cz</a>
Result on the web
<a href="http://www.springerlink.com/content/p49382326524871h/" target="_blank" >http://www.springerlink.com/content/p49382326524871h/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10579-011-9162-z" target="_blank" >10.1007/s10579-011-9162-z</a>
Alternative languages
Result language
English
Original language name
Annotation of sentence structure: Capturing the relationship between clauses in Czech sentences
Original language description
The focus of this article is on the creation of a collection of sentences manually annotated with respect to their sentence structure. We show that the concept of linear segments (linguistically motivated units, which may be easily detected automatically) serves as a good basis for the identification of clauses in Czech. The segment annotation captures such relationships as subordination, coordination, apposition and parenthesis; based on segmentation charts, individual clauses forming a complex sentence are identified. The annotation of sentence structure enriches a dependency-based framework with explicit syntactic information on relations among complex units like clauses. We have gathered a collection of 3,444 sentences from the Prague Dependency Treebank, which were annotated with respect to their sentence structure (these sentences comprise 10,746 segments forming 6,341 clauses). The main purpose of the project is to gain development data; promising results for Czech NLP tools (a
Czech name
—
Czech description
—
Classification
Type
J<sub>x</sub> - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)
CEP classification
AI - Linguistics
OECD FORD branch
—
Result continuities
Project
Result was created during the realization of more than one project.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2012
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Language Resources and Evaluation
ISSN
1574-020X
e-ISSN
—
Volume of the periodical
46
Issue of the periodical within the volume
1
Country of publishing house
NL - THE KINGDOM OF THE NETHERLANDS
Number of pages
12
Pages from-to
25-36
UT code for WoS article
000302289400002
EID of the result in the Scopus database
—