Understanding the Limits of 2D Skeletons for Action Recognition

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F21%3A00118833" target="_blank" >RIV/00216224:14330/21:00118833 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/article/10.1007/s00530-021-00754-0" target="_blank" >https://link.springer.com/article/10.1007/s00530-021-00754-0</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s00530-021-00754-0" target="_blank" >10.1007/s00530-021-00754-0</a>

Alternative languages

Result language
English
Title in the original language
Understanding the Limits of 2D Skeletons for Action Recognition
Result description in the original language
With the development of motion capture technologies, 3D action recognition has become a popular task that finds great applicability in many areas, such as augmented reality, human–computer interaction, sports, or healthcare. On the other hand, the acquisition of 3D human skeleton data is an expensive and time-consuming process, mainly due to the high costs of capturing technologies and the absence of suitable actors. We overcome these issues by focusing on the 2D skeleton modality that can be easily extracted from ordinary videos. The objective of this work is to demonstrate a high descriptive power of such a 2D skeleton modality by achieving accuracy on the task of daily action recognition competitive to 3D skeleton data. More importantly, we thoroughly analyze the factors that significantly influence the 2D recognition accuracy, such as the sensitivity towards data normalization, scaling, quantization, and 3D-to-2D distortions in skeleton orientations and sizes, which are caused by the loss of depth dimension and fixed-angle camera view. We also provide valuable insights on how to mitigate these problems to increase recognition accuracy significantly. The experimental evaluation is conducted on three datasets different in nature. The ability to learn different types of actions better using either 2D or 3D skeletons is also reported. Throughout experiments, a generic light-weight LSTM network is used, whose architecture can be easily tuned to achieve the desired trade-off between its accuracy and efficiency. We show that the proposed approach achieves not only the state-of-the-art results in 2D skeleton action recognition but is also highly competitive to the best-performing methods classifying 3D skeleton sequences or the visual content extracted from ordinary videos.
Title in English
Understanding the Limits of 2D Skeletons for Action Recognition
Result description in English
With the development of motion capture technologies, 3D action recognition has become a popular task that finds great applicability in many areas, such as augmented reality, human–computer interaction, sports, or healthcare. On the other hand, the acquisition of 3D human skeleton data is an expensive and time-consuming process, mainly due to the high costs of capturing technologies and the absence of suitable actors. We overcome these issues by focusing on the 2D skeleton modality that can be easily extracted from ordinary videos. The objective of this work is to demonstrate a high descriptive power of such a 2D skeleton modality by achieving accuracy on the task of daily action recognition competitive to 3D skeleton data. More importantly, we thoroughly analyze the factors that significantly influence the 2D recognition accuracy, such as the sensitivity towards data normalization, scaling, quantization, and 3D-to-2D distortions in skeleton orientations and sizes, which are caused by the loss of depth dimension and fixed-angle camera view. We also provide valuable insights on how to mitigate these problems to increase recognition accuracy significantly. The experimental evaluation is conducted on three datasets different in nature. The ability to learn different types of actions better using either 2D or 3D skeletons is also reported. Throughout experiments, a generic light-weight LSTM network is used, whose architecture can be easily tuned to achieve the desired trade-off between its accuracy and efficiency. We show that the proposed approach achieves not only the state-of-the-art results in 2D skeleton action recognition but is also highly competitive to the best-performing methods classifying 3D skeleton sequences or the visual content extracted from ordinary videos.

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10200 - Computer and information sciences

Result linkages

Project
<a href="/cs/project/GA19-02033S" target="_blank" >GA19-02033S: Search, analytics, and annotation of human motion data streams</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of publication
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Multimedia Systems
ISSN
0942-4962
e-ISSN
1432-1882
Volume of the periodical
27
Issue of the periodical within the volume
3
Country of the periodical's publisher
US - United States of America
Number of pages of the result
15
Pages from-to
547-561
UT WoS code of the article
000615767700001
EID of the result in the Scopus database
2-s2.0-85100576467

Similar results (10)

Understanding the Gap between 2D and 3D Skeleton-Based Action Recognition
Augmenting Spatio-Temporal Human Motion Data for Effective 3D Action Recognition
Towards Efficient Human Action Retrieval based on Triplet-Loss Metric Learning
