Sign Pose-based Transformer for Word-level Sign Language Recognition
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F22%3A43966109" target="_blank" >RIV/49777513:23520/22:43966109 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/9707552" target="_blank" >https://ieeexplore.ieee.org/document/9707552</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/WACVW54805.2022.00024" target="_blank" >10.1109/WACVW54805.2022.00024</a>
Alternative languages
Result language
English
Original language name
Sign Pose-based Transformer for Word-level Sign Language Recognition
Original language description
In this paper we present a system for word-level sign language recognition based on the Transformer model. We aim at a solution with low computational cost, since we see great potential in the use of such a recognition system on hand-held devices. We base the recognition on the estimation of the pose of the human body in the form of 2D landmark locations. We introduce a robust pose normalization scheme which takes the signing space into consideration and processes the hand poses in a separate local coordinate system, independent of the body pose. We show experimentally the significant impact of this normalization on the accuracy of our proposed system. We introduce several augmentations of the body pose that further improve the accuracy, including a novel sequential joint rotation augmentation. With all the systems in place, we achieve state-of-the-art top-1 results on the WLASL and LSA64 datasets. For WLASL, we successfully recognize 63.18 % of sign recordings in the 100-gloss subset, which is a relative improvement of 5 % over the prior state of the art. For the 300-gloss subset, we achieve a recognition rate of 43.78 %, which is a relative improvement of 3.8 %. With the LSA64 dataset, we report a test recognition accuracy of 100 %.
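The description mentions processing hand poses in a separate local coordinate system, independent of the body pose. This record does not give the details of that scheme, so the following is only a minimal sketch of one plausible reading: each hand's 2D landmarks are rescaled into a unit box derived from their own bounding box. The function name `normalize_hand` and the choice of a square bounding box are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def normalize_hand(landmarks: List[Point]) -> List[Point]:
    """Map 2D hand landmarks into a local box with origin at the hand's
    minimum corner (hypothetical helper, not taken from the paper)."""
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    min_x, min_y = min(xs), min(ys)
    # Divide by the longer bounding-box side so the hand's aspect ratio
    # is preserved (an assumption of this sketch).
    side = max(max(xs) - min_x, max(ys) - min_y)
    if side == 0:
        # Degenerate case: all landmarks coincide.
        return [(0.0, 0.0) for _ in landmarks]
    return [((x - min_x) / side, (y - min_y) / side) for x, y in landmarks]


# The same hand shape at two different positions in the frame.
hand = [(0.0, 0.0), (2.0, 0.0), (0.0, 4.0)]
shifted = [(x + 10.0, y + 20.0) for x, y in hand]
print(normalize_hand(hand) == normalize_hand(shifted))
```

Under these assumptions the normalized coordinates depend only on the shape of the hand, not on where it appears in the frame, which is the property the description attributes to the local coordinate system.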
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/LM2018101" target="_blank" >LM2018101: Digital Research Infrastructure for the Language Technologies, Arts and Humanities</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2022
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops
ISBN
978-1-66545-824-5
ISSN
2572-4398
e-ISSN
2690-621X
Number of pages
10
Pages from-to
182-191
Publisher name
IEEE
Place of publication
New York
Event location
Waikoloa, HI, United States
Event date
Jan 4, 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000802187100020