Zero-shot Out-of-domain is No Joke: Lessons Learned in the VoiceMOS 2023 MOS Prediction Challenge

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F24%3A43972870" target="_blank" >RIV/49777513:23520/24:43972870 - isvavai.cz</a>
Výsledek na webu
<a href="https://www.isca-archive.org/interspeech_2024/kunesova24_interspeech.pdf" target="_blank" >https://www.isca-archive.org/interspeech_2024/kunesova24_interspeech.pdf</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.21437/Interspeech.2024-400" target="_blank" >10.21437/Interspeech.2024-400</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Zero-shot Out-of-domain is No Joke: Lessons Learned in the VoiceMOS 2023 MOS Prediction Challenge
Popis výsledku v původním jazyce
This paper describes our team’s experiences in the VoiceMOS Challenge 2023 - a challenge centered around the evaluation of the quality of synthetic or noisy speech. Inspired by our success with an ensemble approach in the first VoiceMOS Challenge in 2022, we submitted an ensemble of four models this time, based on wav2vec 2.0, QuartzNet, CNN-RNN, and LDNet. This was enough to win one of the two tracks we participated in (Track 1b). However, post-challenge analysis shows that only two of the models offer a meaningful contribution in any of the VoiceMOS 2023 tracks, while the other two only degrade the ensemble’s overall performance. On the other hand, post-challenge results on Track 2 (singing voice conversion data) surpassed all our expectations. In the paper, we explain how we tried to deal with the new zero-shot out-of-domain scenarios, analyze the results, and discuss the lessons learned.
Název v anglickém jazyce
Zero-shot Out-of-domain is No Joke: Lessons Learned in the VoiceMOS 2023 MOS Prediction Challenge
Popis výsledku anglicky
This paper describes our team’s experiences in the VoiceMOS Challenge 2023 - a challenge centered around the evaluation of the quality of synthetic or noisy speech. Inspired by our success with an ensemble approach in the first VoiceMOS Challenge in 2022, we submitted an ensemble of four models this time, based on wav2vec 2.0, QuartzNet, CNN-RNN, and LDNet. This was enough to win one of the two tracks we participated in (Track 1b). However, post-challenge analysis shows that only two of the models offer a meaningful contribution in any of the VoiceMOS 2023 tracks, while the other two only degrade the ensemble’s overall performance. On the other hand, post-challenge results on Track 2 (singing voice conversion data) surpassed all our expectations. In the paper, we explain how we tried to deal with the new zero-shot out-of-domain scenarios, analyze the results, and discuss the lessons learned.

Klasifikace

Druh
D - Stať ve sborníku
CEP obor
—
OECD FORD obor
20205 - Automation and control systems

Návaznosti výsledku

Projekt
<a href="/cs/project/GA22-27800S" target="_blank" >GA22-27800S: Využití vícemodálních Transformerů pro přirozenější hlasový dialog</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Ostatní

Rok uplatnění
2024
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název statě ve sborníku
Interspeech 2024
ISBN
—
ISSN
2308-457X
e-ISSN
2958-1796
Počet stran výsledku
5
Strana od-do
4913-4917
Název nakladatele
International Speech Communication Association (ISCA)
Místo vydání
New York
Místo konání akce
Kos, Řecko
Datum konání akce
1. 9. 2024
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
001331850105005

Podobné výsledky(10)

Ensemble of Deep Neural Network Models for MOS Prediction Mutual Support of Data Modalities in the Task of Sign Language Recognition Recurrent Neural Networks for Dialogue State Tracking

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Zero-shot Out-of-domain is No Joke: Lessons Learned in the VoiceMOS 2023 MOS Prediction Challenge

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)