Ensemble of Deep Neural Network Models for MOS Prediction

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F23%3A43968754" target="_blank" >RIV/49777513:23520/23:43968754 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/10095676" target="_blank" >https://ieeexplore.ieee.org/document/10095676</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ICASSP49357.2023.10095676" target="_blank" >10.1109/ICASSP49357.2023.10095676</a>

Alternative languages

Result language
English
Title in original language
Ensemble of Deep Neural Network Models for MOS Prediction
Result description in original language
Automatic evaluation of the quality of synthetic speech has the potential to serve as a cheaper and less time-consuming alternative to standard listening tests. In this paper, we present our contribution to the ongoing research: a system for automatic prediction of the mean opinion score (MOS) given by human listeners. The system was specifically developed for the recent VoiceMOS Challenge. Following the success of fusion systems in similar challenges, our contribution is an ensemble that interpolates the outputs of seven different models: four different wav2vec models, a CNN-RNN model, QuartzNet, and the LDNet baseline. During the VoiceMOS Challenge, our system achieved the second-best utterance-level MSE of 0.171 and ranged from 2nd to 8th place among all 22 participating teams in terms of other evaluation metrics.
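
The description above mentions an ensemble that interpolates the outputs of seven models and is scored by utterance-level MSE. The record does not say how the interpolation weights were chosen, so the following is only a minimal sketch under assumptions: the weights are fitted by ordinary least squares on a held-out set, the data are synthetic, and all function names (`fit_interpolation_weights`, `ensemble_predict`, `utterance_mse`) are hypothetical rather than taken from the paper.

```python
import numpy as np

def fit_interpolation_weights(dev_preds, dev_mos):
    """Fit linear interpolation weights and a bias by least squares.

    dev_preds: array of shape (n_utterances, n_models), one column per model.
    dev_mos:   array of shape (n_utterances,), human MOS labels.
    Assumption: least squares is used here for illustration only.
    """
    design = np.hstack([dev_preds, np.ones((dev_preds.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, dev_mos, rcond=None)
    return coef[:-1], coef[-1]

def ensemble_predict(preds, weights, bias=0.0):
    """Interpolate per-model predictions and clip to the MOS range [1, 5]."""
    return np.clip(preds @ weights + bias, 1.0, 5.0)

def utterance_mse(pred, target):
    """Mean squared error computed over individual utterances."""
    return float(np.mean((pred - target) ** 2))

# Synthetic stand-in for seven models: true MOS plus model-specific noise.
rng = np.random.default_rng(0)
true_mos = rng.uniform(1.0, 5.0, size=200)
noise_levels = np.array([0.4, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8])
model_preds = true_mos[:, None] + rng.normal(0.0, 1.0, size=(200, 7)) * noise_levels

weights, bias = fit_interpolation_weights(model_preds, true_mos)
fused = ensemble_predict(model_preds, weights, bias)
print("ensemble MSE:", utterance_mse(fused, true_mos))
print("best single-model MSE:",
      min(utterance_mse(model_preds[:, k], true_mos) for k in range(7)))
```

In practice the weights would be fitted on a development set and applied to unseen test utterances; the sketch evaluates on the fitting data only to keep the example short.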

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
20205 - Automation and control systems

Result linkages

Project
<a href="/cs/project/GA22-27800S" target="_blank" >GA22-27800S: Using multimodal Transformers for more natural voice dialogue</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ISBN
978-1-72816-327-7
ISSN
1520-6149
e-ISSN
2379-190X
Number of pages
5
Pages from-to
—
Publisher name
IEEE
Place of publication
New York
Event location
Rhodes Island, Greece
Event date
4 June 2023
Event type by nationality
WRD - Worldwide event
UT WoS article code
—

Similar results (10)

Zero-shot Out-of-domain is No Joke: Lessons Learned in the VoiceMOS 2023 MOS Prediction Challenge
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Použití automatického rozpoznávání řeči pro hodnocení systémů pro převod psaného textu na řeč (Using automatic speech recognition to evaluate text-to-speech systems)
