Audio-Visual Speech Recognition in Noisy Audio Environments
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F13%3A%230002802" target="_blank" >RIV/46747885:24220/13:#0002802 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1109/TSP.2013.6613979" target="_blank" >http://dx.doi.org/10.1109/TSP.2013.6613979</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/TSP.2013.6613979" target="_blank" >10.1109/TSP.2013.6613979</a>
Alternative languages
Result language
English
Original language name
Audio-Visual Speech Recognition in Noisy Audio Environments
Original language description
It is well known that the visual part of speech can improve the recognition rate, mainly in noisy conditions. The main goal of this work is to find a set of visual features that can be used in our audio-visual speech recognition systems. Visual features based on the Discrete Cosine Transform (DCT) and the Active Appearance Model (AAM) are extracted from visual speech signals, enhanced by a simplified variant of Hierarchical Linear Discriminant Analysis (HiLDA), and normalized across speakers. The visual features are then combined with standard MFCC audio features by the middle fusion method. The audio-visual speech recognition results are compared with results from experiments in which the log-spectra minimum mean square error and multiband spectral subtraction methods are used to reduce additive noise in the audio signal.
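As an illustration of the DCT-based visual features mentioned in the description, the following is a minimal sketch, not the authors' implementation. It assumes a small grayscale mouth region of interest given as a list of rows and keeps only the low-frequency corner of its 2D DCT as the feature vector. The function names `dct_2d` and `low_frequency_features`, the parameter `keep`, and the choice of a square low-frequency block are hypothetical; the record does not state how the coefficients were selected.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a rectangular image block (list of rows)."""
    rows, cols = len(block), len(block[0])

    def alpha(k, n):
        # Scaling factors that make the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for y in range(rows):
                for x in range(cols):
                    total += (
                        block[y][x]
                        * math.cos(math.pi * (2 * y + 1) * u / (2 * rows))
                        * math.cos(math.pi * (2 * x + 1) * v / (2 * cols))
                    )
            coeffs[u][v] = alpha(u, rows) * alpha(v, cols) * total
    return coeffs


def low_frequency_features(block, keep=3):
    """Flatten the top-left keep x keep DCT coefficients (assumed selection rule)."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


# Hypothetical 4x4 region of interest with uniform brightness.
roi = [[2.0] * 4 for _ in range(4)]
features = low_frequency_features(roi, keep=2)
```

For a uniform region, all the energy ends up in the first (DC) coefficient, which is why low-frequency coefficients are a compact description of the overall mouth-region appearance. The direct quadruple loop is only practical for small regions; a real system would use an optimized transform.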
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2013
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proc. of 36th International Conference on Telecommunications and Signal Processing (TSP 2013)
ISBN
9781479904044
ISSN
—
e-ISSN
—
Number of pages
4
Pages from-to
484-487
Publisher name
—
Place of publication
—
Event location
Italy
Event date
Jan 1, 2013
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—