J-SNACS: Adposition and Case Supersenses for Japanese Joshi
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/25:PZ6QBRMM - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3APZ6QBRMM)
Result on the web
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85195983863&partnerID=40&md5=da8cc7b3b4be90492f992a300dc0cb72
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
J-SNACS: Adposition and Case Supersenses for Japanese Joshi
Original language description
Many languages use adpositions (prepositions or postpositions) to mark a variety of semantic relations, with different languages exhibiting both commonalities and idiosyncrasies in the relations grouped under the same lexeme. We present the first Japanese extension of the SNACS framework (Schneider et al., 2018), which has served as the basis for annotating adpositions in corpora from several languages. After establishing which of the set of particles (joshi) in Japanese qualify as case markers and adpositions as defined in SNACS, we annotate 10 chapters (≈10k tokens) of the Japanese translation of Le Petit Prince (The Little Prince), achieving high inter-annotator agreement. We find that, while a majority of the particles and their uses are captured by the existing and extended SNACS annotation guidelines from the previous work, some unique cases were observed. We also conduct experiments investigating the cross-lingual similarity of adposition and case marker supersenses, showing that the language-agnostic SNACS framework captures similarities not clearly observed in multilingual embedding space. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Jt. Int. Conf. Comput. Linguist., Lang. Resour. Eval., LREC-COLING - Main Conf. Proc.
ISBN
978-249381410-4
ISSN
—
e-ISSN
—
Number of pages
11
Pages from-to
9604-9614
Publisher name
European Language Resources Association (ELRA)
Place of publication
—
Event location
Torino, Italia
Event date
Jan 1, 2025
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—