Extraction of Features for Lip-reading Using Autoencoders
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F14%3A%230003112" target="_blank" >RIV/46747885:24220/14:#0003112 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-11581-8_26" target="_blank" >http://dx.doi.org/10.1007/978-3-319-11581-8_26</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-11581-8_26" target="_blank" >10.1007/978-3-319-11581-8_26</a>
Alternative languages
Result language
English
Original language name
Extraction of Features for Lip-reading Using Autoencoders
Original language description
We study the incorporation of facial depth data in the task of isolated word visual speech recognition. We propose novel features based on unsupervised training of a single-layer autoencoder. The features are extracted from both the video and depth channels obtained by a Microsoft Kinect device. We perform all experiments on our database of 54 speakers, each uttering 50 words. We compare our autoencoder features to traditional methods such as DCT or PCA. The features are further processed by a simplified variant of hierarchical linear discriminant analysis in order to capture the speech dynamics. The classification is performed using a multi-stream Hidden Markov Model for various combinations of audio, video, and depth channels. We also evaluate the visual features in joint audio-video isolated word recognition in noisy environments.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2014
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proc. of 16th International Conference, SPECOM
ISBN
978-3-319-11580-1
ISSN
0302-9743
e-ISSN
—
Number of pages
8
Pages from-to
209-216
Publisher name
Springer International Publishing
Place of publication
Berlin, Germany
Event location
Novi Sad, Serbia
Event date
Jan 1, 2014
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000345576400026