Sign Pose-based Transformer for Word-level Sign Language Recognition

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F22%3A43966109" target="_blank" >RIV/49777513:23520/22:43966109 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/9707552" target="_blank" >https://ieeexplore.ieee.org/document/9707552</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/WACVW54805.2022.00024" target="_blank" >10.1109/WACVW54805.2022.00024</a>

Alternative languages

Result language
English
Title in original language
Sign Pose-based Transformer for Word-level Sign Language Recognition
Result description in original language
In this paper, we present a system for word-level sign language recognition based on the Transformer model. We aim for a solution with low computational cost, since we see great potential in using such a recognition system on hand-held devices. We base the recognition on an estimate of the pose of the human body in the form of 2D landmark locations. We introduce a robust pose normalization scheme that takes the signing space into consideration and processes the hand poses in a separate local coordinate system, independent of the body pose. We show experimentally that this normalization has a significant impact on the accuracy of our proposed system. We introduce several augmentations of the body pose that further improve the accuracy, including a novel sequential joint rotation augmentation. With all the systems in place, we achieve state-of-the-art top-1 results on the WLASL and LSA64 datasets. For WLASL, we successfully recognize 63.18% of sign recordings in the 100-gloss subset, a relative improvement of 5% over the prior state of the art. For the 300-gloss subset, we achieve a recognition rate of 43.78%, a relative improvement of 3.8%. On the LSA64 dataset, we report a test recognition accuracy of 100%.
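The description above mentions processing hand poses in a separate local coordinate system, independent of the body pose. The record does not say how that local frame is defined, so the following is only a minimal sketch of one plausible reading: rescaling the 2D hand landmarks to the axis-aligned bounding box of the hand. The function name `normalize_hand`, the `eps` parameter, and the bounding-box choice are all assumptions made for illustration, not the authors' published method.

```python
import numpy as np


def normalize_hand(landmarks, eps=1e-8):
    """Map 2D hand landmarks of shape (N, 2) into a local unit-square frame.

    Assumption (not stated in the record): the local frame is the
    axis-aligned bounding box of the hand, so the result no longer depends
    on where the hand sits in the image or how large it appears.
    Note that this stretches the box to a square, so the hand's aspect
    ratio is not preserved.
    """
    pts = np.asarray(landmarks, dtype=float)
    lower = pts.min(axis=0)                      # bottom-left corner of the box
    upper = pts.max(axis=0)                      # top-right corner of the box
    size = np.maximum(upper - lower, eps)        # guard against a zero-width box
    return (pts - lower) / size


# Hypothetical three-landmark "hand" and a shifted, enlarged copy of it.
hand = np.array([[0.2, 0.3], [0.4, 0.3], [0.3, 0.5]])
moved = hand * 2.0 + np.array([0.1, -0.05])

local_a = normalize_hand(hand)
local_b = normalize_hand(moved)
```

Under this assumption, `local_a` and `local_b` agree up to floating-point error, which is the sense in which the hand representation becomes independent of the hand's position and scale in the frame.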

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
20205 - Automation and control systems

Result linkages

Project
<a href="/cs/project/LM2018101" target="_blank" >LM2018101: Digital Research Infrastructure for Language Technologies, Arts and Humanities</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article title in the proceedings
Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops
ISBN
978-1-66545-824-5
ISSN
2572-4398
e-ISSN
2690-621X
Number of pages
10
Pages from-to
182-191
Publisher name
IEEE
Place of publication
New York
Event location
Waikoloa, HI, United States
Event date
4 January 2022
Event type by nationality
WRD - Worldwide event
UT WoS article code
000802187100020

Similar results (10)

One Model is Not Enough: Ensembles for Isolated Sign Language Recognition
MediaPipe and Its Suitability for Sign Language Recognition
Mutual Support of Data Modalities in the Task of Sign Language Recognition
