Cross-lingual word analogies using linear transformations between semantic spaces
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F19%3A43955837" target="_blank" >RIV/49777513:23520/19:43955837 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0957417419304191" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0957417419304191</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.eswa.2019.06.021" target="_blank" >10.1016/j.eswa.2019.06.021</a>
Alternative languages
Result language
English
Original language name
Cross-lingual word analogies using linear transformations between semantic spaces
Original language description
The ability to represent the meaning of words is one of the core parts of natural language understanding (NLU), with applications ranging across machine translation, summarization, question answering, information retrieval, etc. The need for reasoning in multilingual contexts and transferring knowledge in cross-lingual systems has given rise to cross-lingual semantic spaces, which learn representations of words across different languages. With growing attention to cross-lingual representations, it has become crucial to investigate proper evaluation schemes. The word-analogy-based evaluation has been one of the most common tools to evaluate linguistic relationships (such as male-female relationships or verb tenses) encoded in monolingual meaning representations. In this paper, we go beyond monolingual representations and generalize the word analogy task across languages to provide a new intrinsic evaluation tool for cross-lingual semantic spaces. Our approach allows examining cross-lingual projections and their impact on different aspects of meaning. It helps to discover potential weaknesses or advantages of cross-lingual methods before they are incorporated into different intelligent systems. We experiment with six languages within different language families, including English, German, Spanish, Italian, Czech, and Croatian. State-of-the-art monolingual semantic spaces are transformed into a shared space using dictionaries of word translations. We compare several linear transformations and rank them for experiments with monolingual (no transformation), bilingual (one semantic space is transformed to another), and multilingual (all semantic spaces are transformed onto English space) versions of semantic spaces. We show that tested linear transformations preserve relationships between words (word analogies) and lead to impressive results. We achieve an average accuracy of 51.1%, 43.1%, and 38.2% for monolingual, bilingual, and multilingual semantic spaces, respectively.
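To make the idea in the description concrete, here is a minimal sketch of a cross-lingual word analogy solved through a linear transformation learned from a translation dictionary. Everything in it is an assumption for illustration: the 3-dimensional toy vectors, the German/English word pairs, the helper names `fit_orthogonal_map` and `nearest`, and the choice of an orthogonal (Procrustes) mapping. The record only says that several linear transformations are compared, so this may not match any of the transformations actually evaluated in the article.

```python
import numpy as np

# Toy English space (made-up vectors): dimensions 0/1 encode gender, 2 royalty.
en = {
    "man": np.array([1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 1.0, 0.0]),
    "king": np.array([1.0, 0.0, 1.0]),
    "queen": np.array([0.0, 1.0, 1.0]),
}

# Toy German space: the same geometry under a fixed rotation (an assumption
# that makes the two spaces exactly alignable by an orthogonal map).
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
de = {
    "Mann": en["man"] @ R,
    "Frau": en["woman"] @ R,
    "König": en["king"] @ R,
    "Königin": en["queen"] @ R,
}

# Hypothetical translation dictionary used to learn the mapping.
pairs = [("Mann", "man"), ("Frau", "woman"), ("König", "king"), ("Königin", "queen")]
X = np.stack([de[s] for s, _ in pairs])  # source-language vectors, one per row
Y = np.stack([en[t] for _, t in pairs])  # target-language vectors, one per row


def fit_orthogonal_map(X, Y):
    """Return the orthogonal W minimising ||X W - Y||_F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def nearest(query, vocab, exclude=()):
    """Return the word in vocab whose vector has the highest cosine similarity."""
    best_word, best_sim = None, -np.inf
    for word, vec in vocab.items():
        if word in exclude:
            continue
        sim = query @ vec / (np.linalg.norm(query) * np.linalg.norm(vec))
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word


W = fit_orthogonal_map(X, Y)

# Cross-lingual analogy: Mann : Frau (German) :: king : ? (English).
# The German offset is projected into the English space and added to "king".
query = de["Frau"] @ W - de["Mann"] @ W + en["king"]
answer = nearest(query, en, exclude={"king"})
print(answer)
```

On this toy data the mapping recovers the English space exactly and the analogy resolves to `queen`. With real embeddings the alignment is only approximate, which is what an analogy-based evaluation of this kind is meant to measure.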
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/EF17_048%2F0007267" target="_blank" >EF17_048/0007267: Research and Development of Intelligent Components of Advanced Technologies for the Pilsen Metropolitan Area (InteCom)</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities<br>I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2019
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Expert Systems with Applications
ISSN
0957-4174
e-ISSN
—
Volume of the periodical
135
Issue of the periodical within the volume
NOV 30 2019
Country of publishing house
GB - UNITED KINGDOM
Number of pages
9
Pages from-to
287-295
UT code for WoS article
000480665800022
EID of the result in the Scopus database
2-s2.0-85067242443