MWEs in Treebanks: From Survey to Guidelines
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/16:10335519 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F16%3A10335519)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
MWEs in Treebanks: From Survey to Guidelines
Original language description
By means of an online survey, we have investigated ways in which various types of multiword expressions are annotated in existing treebanks. The results indicate that there is considerable variation in treatments across treebanks and thereby also, to some extent, across languages and across theoretical frameworks. The comparison is focused on the annotation of light verb constructions and verbal idioms. The survey shows that the light verb constructions either get special annotations as such, or are treated as ordinary verbs, while VP idioms are handled through different strategies. Based on insights from our investigation, we propose some general guidelines for annotating multiword expressions in treebanks. The recommendations address the following application-based needs: distinguishing MWEs from similar but compositional constructions; searching distinct types of MWEs in treebanks; awareness of literal and nonliteral meanings; and normalization of the MWE representation. The cross-lingually and cro
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
LD14117: Parsing and multi-word expressions. Towards linguistic precision and computational efficiency in natural language processing (PARSEME)
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2016
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016)
ISBN
978-2-9517408-9-1
ISSN
—
e-ISSN
—
Number of pages
8
Pages from-to
2323-2330
Publisher name
European Language Resources Association
Place of publication
Paris, France
Event location
Portorož, Slovenia
Event date
May 23, 2016
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—