Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F23%3APU149427" target="_blank" >RIV/00216305:26230/23:PU149427 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/10094795" target="_blank" >https://ieeexplore.ieee.org/document/10094795</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ICASSP49357.2023.10094795" target="_blank" >10.1109/ICASSP49357.2023.10094795</a>
Alternative languages
Result language
English
Original language name
Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters
Original language description
Recently, pre-trained Transformer models have received rising interest in the field of speech processing thanks to their great success in various downstream tasks. However, most fine-tuning approaches update all the parameters of the pre-trained model, which becomes prohibitive as the model size grows and sometimes results in overfitting on small datasets. In this paper, we conduct a comprehensive analysis of applying parameter-efficient transfer learning (PETL) methods to reduce the number of learnable parameters required for adapting to speaker verification tasks. Specifically, during fine-tuning, the pre-trained models are frozen and only lightweight modules inserted into each Transformer block are trainable (a method known as adapters). Moreover, to boost performance in a cross-language low-resource scenario, the Transformer model is further tuned on a large intermediate dataset before being directly fine-tuned on a small one. While updating fewer than 4% of the parameters, our proposed PETL-based methods achieve performance comparable to full fine-tuning (Vox1-O: 0.55%, Vox1-E: 0.82%, Vox1-H: 1.73%).
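For illustration, here is a minimal PyTorch sketch of the adapter approach the abstract describes: the pre-trained encoder is frozen and a small residual bottleneck module inserted after each Transformer block is the only trainable part. The bottleneck width, zero-initialized up-projection, and exact insertion point are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Residual bottleneck adapter: down-project, non-linearity, up-project.
    Hypothetical sketch; dimensions and init are assumptions, not the paper's."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        # Zero-init the up-projection so each adapter starts as an identity
        # map and does not perturb the frozen pre-trained representation.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

class AdaptedBlock(nn.Module):
    """Wraps one frozen Transformer layer with a trainable adapter."""
    def __init__(self, block: nn.Module, dim: int):
        super().__init__()
        self.block = block
        self.adapter = BottleneckAdapter(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.block(x))

# Toy stand-in for a pre-trained speech Transformer (e.g. a WavLM-style encoder).
dim, n_layers = 768, 12
encoder = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=dim, nhead=12, batch_first=True)
    for _ in range(n_layers)
)

# Freeze every pre-trained parameter, then attach one adapter per block.
for p in encoder.parameters():
    p.requires_grad = False
adapted = nn.ModuleList(AdaptedBlock(layer, dim) for layer in encoder)

trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
total = sum(p.numel() for p in adapted.parameters())
print(f"trainable: {trainable / total:.1%} of {total:,} parameters")
```

With these toy dimensions the adapters account for roughly 2% of all parameters, consistent in spirit with the abstract's "fewer than 4%" figure; only the adapter weights (and, in practice, a downstream speaker-verification head) would receive gradient updates.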
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the realization of more than one project.
Continuities
P - R&D project financed from public funds (with a link to CEP)
Others
Publication year
2023
Confidentiality
S - Complete and true data about the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISBN
978-1-7281-6327-7
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
1-5
Publisher name
IEEE Signal Processing Society
Place of publication
Rhodes Island
Event location
Rhodes Island, Greece
Event date
Jun 4, 2023
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—