GMM-Based Evaluation of Synthetic Speech Quality Using 2D Classification in Pleasure-Arousal Scale

The result's identifiers

Result code in IS VaVaI
RIV/49777513:23520/21:43961289 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F21%3A43961289)
Result on the web
https://www.mdpi.com/2076-3417/11/1/2
DOI - Digital Object Identifier
10.3390/app11010002 (http://dx.doi.org/10.3390/app11010002)

Alternative languages

Result language
English
Original language name
GMM-Based Evaluation of Synthetic Speech Quality Using 2D Classification in Pleasure-Arousal Scale
Original language description
The paper describes a system for the automatic evaluation of synthetic speech quality based on the Gaussian mixture model (GMM) classifier. The speech material originating from a real speaker is compared with synthesized material to determine similarities or differences between them. The final evaluation order is determined by distances in the Pleasure-Arousal (P-A) space between the original and synthetic speech using different synthesis and/or prosody manipulation methods implemented in the Czech text-to-speech system. The GMM models for continual 2D detection of P-A classes are trained using the sound/speech material from the databases without any relation to the original speech or the synthesized sentences. Preliminary and auxiliary analyses show that the number of mixtures, the number and type of the speech features used, the size of the processed speech material, and the type of the database used for the creation of the GMMs all have a substantial influence on the P-A classification process and on the final evaluation result. The main evaluation experiments confirm the functionality of the system developed. The objective evaluation results obtained are principally correlated with the subjective ratings of human evaluators; however, partial differences were indicated, so a subsequent detailed investigation must be performed.
Czech name
—
Czech description
—

Classification

Type
Jimp - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
20205 - Automation and control systems

Result continuities

Project
GA19-19324S: Fully Trainable Deep Neural Network Based Czech Text-to-Speech Synthesis
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

Name of the periodical
Applied Sciences
ISSN
2076-3417
e-ISSN
—
Volume of the periodical
11
Issue of the periodical within the volume
1
Country of publishing house
CH - SWITZERLAND
Number of pages
18
Pages from-to
1-18
UT code for WoS article
000605808900001
EID of the result in the Scopus database
2-s2.0-85098620235

Similar results(10)

Synthetic Speech Evaluation by Differential Maps in Pleasure-Arousal Space
Synthetic Speech Evaluation by 2D GMM Classification in Pleasure-Arousal Scale
Evaluation of Synthetic Speech by GMM-Based Continuous Detection of Emotional States
