Effects of Large Multi-Speaker Models on the Quality of Neural Speech Synthesis
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F24%3A43974114" target="_blank" >RIV/49777513:23520/24:43974114 - isvavai.cz</a>
Result on the web
<a href="https://svk.fav.zcu.cz/download/proceedings_svk_2024.pdf" target="_blank" >https://svk.fav.zcu.cz/download/proceedings_svk_2024.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Effects of Large Multi-Speaker Models on the Quality of Neural Speech Synthesis
Original language description
These days, speech synthesis is usually performed by neural models (Tan et al., 2021). A neural speech synthesizer is dependent on a large number of parameters, whose values must be acquired during the process of model training. In many situations, the result of training can be improved by fine-tuning a pre-trained model, i.e. using the parameter values of a model which has been trained using different training data to initialize the parameters of the target model before the training process begins (Zhang et al., 2023). In the field of speech synthesis, a pre-trained model is a speech synthesizer which has been trained to synthesize the voice of another speaker. Furthermore, we can use a multi-speaker pre-trained model, which has been trained using speech recordings of multiple speakers, so it should contain general knowledge about human speech. This paper describes how the number of speakers used to train a pre-trained model affects the quality of the final synthetic speech. We used a single-speaker model as well as two multi-speaker models for fine-tuning and we compared the obtained models in a listening test.
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations