Comparison of One and Two-Level Architecture of the GMM-Based Speaker Age Classifier
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F16%3A43929603" target="_blank" >RIV/49777513:23520/16:43929603 - isvavai.cz</a>
Result on the web
<a href="http://ieeexplore.ieee.org/document/7760883/" target="_blank" >http://ieeexplore.ieee.org/document/7760883/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/TSP.2016.7760883" target="_blank" >10.1109/TSP.2016.7760883</a>
Alternative languages
Result language
English
Original language name
Comparison of One and Two-Level Architecture of the GMM-Based Speaker Age Classifier
Original language description
The paper describes an experiment using Gaussian mixture models (GMM) for automatic classification of speaker age and gender. The developed two-level architecture is compared in detail with the standard one-level GMM classifier, analysing the influence of the number of mixtures and of the type of speech features used for GMM gender/age classification, as well as the computational complexity as a function of the number of mixtures. Finally, the GMM classification accuracy is compared with an evaluation by the conventional listening test method. The mean age classification accuracy of 92.3 % obtained for the proposed two-level architecture is better than that of the standard one-level architecture (78.7 %) and that of the listening test evaluation (74.6 %). However, the computational complexity of the two-level architecture is about twice that of the one-level architecture, in both the GMM model creation and the classification phases.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JD - Use of computers, robotics and its application
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/GA16-04420S" target="_blank" >GA16-04420S: Combining phonetic and corpus-based approaches to remedy disruptive effects in synthetic speech</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2016
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
2016 39TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP)
ISBN
978-1-5090-1288-6
ISSN
1805-5435
e-ISSN
—
Number of pages
4
Pages from-to
299-302
Publisher name
IEEE
Place of publication
NEW YORK, NY
Event location
Vienna, Austria
Event date
Jun 27, 2016
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000390164000064