Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy, and the Lexicon-Corpus Interface
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/24:10492849 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F24%3A10492849)
Alternative codes found
RIV/00216208:11320/25:MHHLXEMS
Result on the web
https://aclanthology.org/2024.mwe-1.19.pdf
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy, and the Lexicon-Corpus Interface
Original language description
We present ongoing work towards defining a lexicon-corpus interface to serve as a benchmark for representing multiword expressions (of various parts of speech) in dedicated lexica and for linking these entries to their corpus occurrences. The final aim is to harness such resources for the automatic identification of multiword expressions in text. Involving several natural languages is intended to yield a universal solution that is not centered on a particular language while still accommodating idiosyncrasies. We discuss challenges in the lexicographic description of multiword expressions, outline the current status of lexica dedicated to this linguistic phenomenon, and describe the solution we envisage for creating an ecosystem of interlinked lexica and corpora, the former containing multiword expressions and the latter annotated with them.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the LREC-COLING 2024 Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD 2024)
ISBN
978-2-493-81420-3
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
147-153
Publisher name
European Language Resources Association (ELRA)
Place of publication
Torino, Italy
Event location
Torino, Italy
Event date
May 25, 2024
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—