Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/21:10440894 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F21%3A10440894)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
Original language description
For many (minority) languages, the resources needed to train large models are not available. We investigate the performance of zero-shot transfer learning with as little data as possible, and the influence of language similarity in this process. We retrain the lexical layers of four BERT-based models using data from two low-resource target language varieties, while the Transformer layers are independently fine-tuned on a POS-tagging task in the model's source language. By combining the new lexical layers and fine-tuned Transformer layers, we achieve high task performance for both target languages. With high language similarity, 10MB of data appears sufficient to achieve substantial monolingual transfer performance. Monolingual BERT-based models generally achieve higher downstream task performance after retraining the lexical layer than multilingual BERT, even when the target language is included in the multilingual model.
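For readers who want the mechanics of the procedure summarized above, the following is a minimal sketch of the lexical-layer swap, assuming the Hugging Face transformers library. The model name, tag count, and training data are illustrative placeholders rather than the authors' actual configuration, and both training loops are elided.

```python
from transformers import AutoModelForMaskedLM, AutoModelForTokenClassification

SOURCE_MODEL = "bert-base-cased"  # hypothetical monolingual source-language model
NUM_POS_TAGS = 17                 # e.g. the Universal Dependencies UPOS tagset

# Step 1: fine-tune the source model's Transformer layers on POS tagging in
# the source language (standard token-classification training, omitted here).
tagger = AutoModelForTokenClassification.from_pretrained(
    SOURCE_MODEL, num_labels=NUM_POS_TAGS
)
# ... fine-tune `tagger` on source-language POS data ...

# Step 2: independently retrain only the lexical (input embedding) layer on
# the small target-language corpus with masked language modelling: freeze
# all parameters, then unfreeze the input embeddings alone. (In BERT the
# input embeddings are weight-tied to the MLM output layer, so that tied
# layer is updated as well.)
mlm = AutoModelForMaskedLM.from_pretrained(SOURCE_MODEL)
for param in mlm.parameters():
    param.requires_grad = False
for param in mlm.get_input_embeddings().parameters():
    param.requires_grad = True
# ... run MLM training on ~10MB of target-language text ...

# Step 3: combine the retrained lexical layer with the fine-tuned Transformer
# layers, yielding a zero-shot POS tagger for the target language.
tagger.set_input_embeddings(mlm.get_input_embeddings())
```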
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
ISBN
978-1-954085-54-1
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
4901-4907
Publisher name
Association for Computational Linguistics
Place of publication
Stroudsburg
Event location
online
Event date
Aug 1, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—