Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/21:10442279 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F21%3A10442279)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages
Original language description
We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models through multilingual fine-tuning. We hypothesize and validate that multilingual fine-tuning of pre-trained language models can yield better performance on downstream NLP applications than models fine-tuned on individual languages. We present a first-of-its-kind detailed study that tracks performance change as languages are added to a base language in a graded and greedy manner (greedy in the sense of the best performance boost); it reveals that careful selection of a subset of related languages can significantly improve performance over utilizing all related languages. The Indo-Aryan (IA) language family is chosen for the study; the exact languages are Bengali, Gujarati, Hindi, Marathi, Oriya, Punjabi, and Urdu. The script barrier is crossed by simple rule-based transliteration of the text of all languages into Devanagari. Experiments are performed on mBERT, IndicBERT, MuRIL, and two RoBERTa-based LMs, the last two pre-trained by us. Low-resource languages, such as Oriya and Punjabi, are found to be the largest beneficiaries of multilingual fine-tuning. Our test bed comprises the Textual Entailment, Entity Classification, and Section Title Prediction tasks of IndicGLUE, as well as POS tagging. Compared to monolingual fine-tuning, we obtain relative performance improvements of up to 150% on the downstream tasks. The surprising takeaway is that for any language there is a particular combination of other languages that yields the best performance, and any additional language is in fact detrimental.
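A minimal sketch of the graded, greedy language-selection procedure the abstract describes. All names here (IA_LANGS, evaluate_finetuned, greedy_language_selection) are hypothetical placeholders, not the authors' code: evaluate_finetuned(base, subset) stands in for "fine-tune the pre-trained LM on the base language plus the subset (after transliteration to Devanagari) and return a downstream dev score".

```python
# Hypothetical sketch of greedy language selection for multilingual
# fine-tuning; names and the evaluation callable are assumptions.

IA_LANGS = ["bn", "gu", "hi", "mr", "or", "pa", "ur"]  # Indo-Aryan languages in the study


def greedy_language_selection(base_lang, evaluate_finetuned):
    """Greedily add the related language that gives the best score boost;
    stop once no remaining candidate improves on the current subset."""
    candidates = [lang for lang in IA_LANGS if lang != base_lang]
    chosen = []
    best_score = evaluate_finetuned(base_lang, chosen)  # monolingual baseline
    while candidates:
        # Score every one-language extension of the current subset.
        scored = [(evaluate_finetuned(base_lang, chosen + [lang]), lang)
                  for lang in candidates]
        score, lang = max(scored)
        if score <= best_score:
            break  # any additional language is detrimental; stop here
        best_score = score
        chosen.append(lang)
        candidates.remove(lang)
    return chosen, best_score
```

Under this reading, the loop terminates exactly at the "best combination" the abstract mentions: the first round in which no added language improves the downstream score.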
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
ISBN
978-1-955917-09-4
ISSN
—
e-ISSN
—
Number of pages
12
Pages from-to
8584-8595
Publisher name
Association for Computational Linguistics
Place of publication
Stroudsburg
Event location
Punta Cana
Event date
Nov 7, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—