Probing Multilingual Sentence Representations With X-Probe
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/19:10427071 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10427071)
Result on the web
https://www.aclweb.org/anthology/W19-4318
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Probing Multilingual Sentence Representations With X-Probe
Original language description
This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain. In doing so, we make two contributions: first, we provide datasets for multilingual probing, derived from Wikipedia, in five languages, viz. English, French, German, Spanish and Russian. Second, we evaluate six sentence encoders for each language, each trained by mapping sentence representations to English sentence representations, using sentences in a parallel corpus. We discover that cross-lingually mapped representations are often better at retaining certain linguistic information than representations derived from English encoders trained on natural language inference (NLI) as a downstream task.
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations