Synthetic Speech Evaluation by 2D GMM Classification in Pleasure-Arousal Scale
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F20%3A43959712" target="_blank" >RIV/49777513:23520/20:43959712 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/9163559" target="_blank" >https://ieeexplore.ieee.org/document/9163559</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/TSP49548.2020.9163559" target="_blank" >10.1109/TSP49548.2020.9163559</a>
Alternative languages
Result language
English
Original language name
Synthetic Speech Evaluation by 2D GMM Classification in Pleasure-Arousal Scale
Original language description
The paper describes a system for automatic evaluation of synthetic speech quality based on two-dimensional detection in the Pleasure-Arousal (P-A) scale. The original speech material of the speaker used for synthesis is compared with the synthesized speech to find similarities and differences between them. For continual P-A detection, a Gaussian mixture model (GMM) classifier is used. The GMM models of the P-A classes are created and trained using sound/speech material from a database labelled directly in the P-A scale, without any relation to the original speech or the tested sentences. The basic experiments confirm the principal functionality of the developed system. Additional analysis shows the great importance of properly selecting the number of mixtures and the type of sound/speech database used for building the GMM models. The obtained objective evaluation results are highly correlated with the subjective ratings of human evaluators.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/GA19-19324S" target="_blank" >GA19-19324S: Fully Trainable Deep Neural Network Based Czech Text-to-Speech Synthesis</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2020
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
2020 43rd International Conference on Telecommunications and Signal Processing (TSP)
ISBN
978-1-72816-376-5
ISSN
—
e-ISSN
—
Number of pages
4
Pages from-to
10-13
Publisher name
IEEE
Place of publication
New York
Event location
Milan, Italy
Event date
Jul 7, 2020
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000577106400003