Lifting the Curse of Multilinguality by Pre-training Modular Transformers
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AE4LEH4U7" target="_blank" >RIV/00216208:11320/23:E4LEH4U7 - isvavai.cz</a>
Result on the web
<a href="https://aclanthology.org/2022.naacl-main.255" target="_blank" >https://aclanthology.org/2022.naacl-main.255</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.18653/v1/2022.naacl-main.255" target="_blank" >10.18653/v1/2022.naacl-main.255</a>
Alternative languages
Result language
English
Original language name
Lifting the Curse of Multilinguality by Pre-training Modular Transformers
Original language description
"Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages. We address this issue by introducing language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant. In contrast with prior work that learns language-specific components post-hoc, we pre-train the modules of our Cross-lingual Modular (X-Mod) models from the start. Our experiments on natural language inference, named entity recognition and question answering show that our approach not only mitigates the negative interference between languages, but also enables positive transfer, resulting in improved monolingual and cross-lingual performance. Furthermore, our approach enables adding languages post-hoc with no measurable drop in performance, no longer limiting the model usage to the set of pre-trained languages."
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies"
ISBN
978-1-955917-71-1
ISSN
—
e-ISSN
—
Number of pages
17
Pages from-to
3479-3495
Publisher name
Association for Computational Linguistics
Place of publication
Seattle, USA
Event location
Seattle, USA
Event date
Jul 10-15, 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—