Study on the Use of Deep Neural Networks for Speech Activity Detection in Broadcast Recordings
The result's identifiers
Result code in IS VaVaI
RIV/46747885:24220/16:00000471 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F16%3A00000471)
Result on the web
http://dx.doi.org/10.5220/0005952700450051
DOI - Digital Object Identifier
10.5220/0005952700450051
Alternative languages
Result language
English
Original language name
Study on the Use of Deep Neural Networks for Speech Activity Detection in Broadcast Recordings
Original language description
This paper deals with the task of Speech Activity Detection (SAD). Our goal is to develop a SAD module suitable for a system for broadcast data transcription. Various Deep Neural Networks (DNNs) are evaluated for this purpose. Training of DNNs is performed using speech and non-speech data as well as artificial data created by mixing of both these data types at a desired level of Signal-to-Noise Ratio (SNR). The output from each DNN is smoothed using a decoder based on Weighted Finite State Transducers (WFSTs). The presented experimental results show that the use of the resulting SAD module leads to a) a slight improvement in transcription accuracy and b) a significant reduction in the computation time needed for transcription.
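The description mentions artificial training data created by mixing speech and non-speech at a desired SNR. The paper's exact procedure is not given in this record, so the following is only a minimal sketch of one common way to do it, assuming two equal-length sample sequences and average-power SNR. The names `mix_at_snr` and `snr_db` are hypothetical and chosen for illustration.

```python
import math
import random


def snr_db(signal, noise):
    """Return the average-power ratio of signal to noise in decibels."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals target_snr_db,
    then add it to `speech`. Returns (mixture, gain applied to the noise).

    Assumption: both inputs are sequences of float samples of equal length.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("both signals must have non-zero power")
    # target = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    mixture = [s + gain * n for s, n in zip(speech, noise)]
    return mixture, gain


# Illustrative usage with synthetic signals (not data from the paper).
rng = random.Random(0)
speech = [math.sin(0.1 * i) for i in range(1000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
mixture, gain = mix_at_snr(speech, noise, 10.0)
```

Scaling the noise rather than the speech keeps the speech level unchanged across mixtures; that is a design choice of this sketch, not something stated in the record.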
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
TA04010199: MULTILINMEDIA - Multilingual Multimedia Monitoring and Analyzing Platform
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific university research
Others
Publication year
2016
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proc. of 13th International Conference on Signal Processing and Multimedia Applications (SIGMAP 2016)
ISBN
978-989-758-196-0
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
45-51
Publisher name
SciTePress
Place of publication
Lisbon, Portugal
Event location
Lisbon, Portugal
Event date
Jan 1, 2016
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000391091400004