Findings of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AILHQEMR5" target="_blank" >RIV/00216208:11320/25:ILHQEMR5 - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85189645516&partnerID=40&md5=6c6cce0de13a8e1236cd42e3f9ab9ca3" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85189645516&partnerID=40&md5=6c6cce0de13a8e1236cd42e3f9ab9ca3</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Findings of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages
Original language description
This paper discusses the organisation and findings of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages. The shared task was split into the constrained and unconstrained tracks and involved solving either three or five problems for 12+ ancient and historical languages belonging to four language families and making use of six different scripts. There were 14 registrations in total, of which three teams participated in each track. Out of these six submissions, two systems were successful in the constrained setting and another two in the unconstrained setting, and four system description papers were submitted by different teams. The best average results for POS-tagging, lemmatisation and morphological feature prediction were 96.09%, 94.88% and 96.68% respectively. In the mask filling problem, the winning team could not achieve a higher average score across all 16 languages than 5.95% at the word level, which demonstrates the difficulty of this problem. At the character level, the best average result over 16 languages was 55.62%. © 2024 Association for Computational Linguistics.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
SIGTYP - Workshop Res. Comput. Linguist. Typology Multiling. NLP, Proc. Workshop
ISBN
979-889176071-4
ISSN
—
e-ISSN
—
Number of pages
13
Pages from-to
160-172
Publisher name
Association for Computational Linguistics (ACL)
Place of publication
—
Event location
St. Julian's, Malta
Event date
Jan 1, 2025
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—