CoBaLD Annotation: the Enrichment of the Enhanced Universal Dependencies with the Semantical Pattern
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/25:AA85X8D8 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AAA85X8D8)
Result on the web
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85195981759&partnerID=40&md5=59a315068616b517b0ef6676905171c1
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
CoBaLD Annotation: the Enrichment of the Enhanced Universal Dependencies with the Semantical Pattern
Original language description
The paper is devoted to an annotation format aimed at morphological, syntactic, and especially semantic markup. The format combines the Enhanced UD morphosyntax and the Compreno semantic pattern, enriching the UD annotation with word meanings and labels for semantic relations between words. To adapt the Compreno semantics for the current purpose, we reduced the number of semantic fields denoting lexical meanings by using hyperonym fields. Moreover, we used a generalized variant of the semantic relations, as the original roles have rather narrow meanings, which makes them too numerous. Creating such a format also demands a Compreno-to-UD morphosyntax conversion, which, in turn, demands solving the asymmetry problem between the models. The asymmetry concerns tokenization, lemmatization, POS-tagging, sets of grammatical features, and dependency heads. To overcome this problem, the Compreno-to-UD converter was created. As an application, the work presents a 150,000-token corpus of English news annotated according to the standard. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Jt. Int. Conf. Comput. Linguist., Lang. Resour. Eval., LREC-COLING - Main Conf. Proc.
ISBN
978-249381410-4
ISSN
—
e-ISSN
—
Number of pages
11
Pages from-to
3422-3432
Publisher name
European Language Resources Association (ELRA)
Place of publication
—
Event location
Torino, Italy
Event date
Jan 1, 2025
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—