Segmentation of Speech and Humming in Vocal Input

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F12%3A00196538" target="_blank" >RIV/68407700:21230/12:00196538 - isvavai.cz</a>
Výsledek na webu
<a href="http://www.radioeng.cz/fulltexts/2012/12_03_0923_0929.pdf" target="_blank" >http://www.radioeng.cz/fulltexts/2012/12_03_0923_0929.pdf</a>
DOI - Digital Object Identifier
—

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Segmentation of Speech and Humming in Vocal Input
Popis výsledku v původním jazyce
Non-verbal vocal interaction (NVVI) is an interaction method in which sounds other than speech produced by a human are used, such as humming. NVVI complements traditional speech recognition systems with continuous control. In order to combine the two approaches (e.g. "volume up, mmm") it is necessary to perform a speech/NVVI segmentation of the input sound signal. This paper presents two novel methods of speech and humming segmentation. The first method is based on classification of MFCC and RMS parameters using a neural network (MFCC method), while the other method computes volume changes in the signal (IAC method). The two methods are compared using a corpus collected from 13 speakers. The results indicate that the MFCC method outperforms IAC in terms of accuracy, precision, and recall.
Název v anglickém jazyce
Segmentation of Speech and Humming in Vocal Input
Popis výsledku anglicky
Non-verbal vocal interaction (NVVI) is an interaction method in which sounds other than speech produced by a human are used, such as humming. NVVI complements traditional speech recognition systems with continuous control. In order to combine the two approaches (e.g. "volume up, mmm") it is necessary to perform a speech/NVVI segmentation of the input sound signal. This paper presents two novel methods of speech and humming segmentation. The first method is based on classification of MFCC and RMS parameters using a neural network (MFCC method), while the other method computes volume changes in the signal (IAC method). The two methods are compared using a corpus collected from 13 speakers. The results indicate that the MFCC method outperforms IAC in terms of accuracy, precision, and recall.

Klasifikace

Druh
J<sub>x</sub> - Nezařazeno - Článek v odborném periodiku (Jimp, Jsc a Jost)
CEP obor
JC - Počítačový hardware a software
OECD FORD obor
—

Návaznosti výsledku

Projekt
—
Návaznosti
Z - Vyzkumny zamer (s odkazem do CEZ)

Ostatní

Rok uplatnění
2012
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název periodika
Radioengineering
ISSN
1210-2512
e-ISSN
—
Svazek periodika
21
Číslo periodika v rámci svazku
3
Stát vydavatele periodika
CZ - Česká republika
Počet stran výsledku
7
Strana od-do
923-929
Kód UT WoS článku
000309253000021
EID výsledku v databázi Scopus
—

Podobné výsledky(10)

Corpus-based database of residual excitations used for speech reconstruction from MFCCs Rekonstrukce řečového signálu z melovských frekvenčních kepstrálních koeficientů Automatic detection of Parkinson's disease in running speech spoken in three different languages

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Segmentation of Speech and Humming in Vocal Input

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)