Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions
The result's identifiers
Result code in IS VaVaI
RIV/68407700:21730/21:00356152 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21730%2F21%3A00356152)
Result on the web
https://doi.org/10.1109/ICCV48922.2021.00186
DOI - Digital Object Identifier
10.1109/ICCV48922.2021.00186
Alternative languages
Result language
English
Original language name
Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions
Original language description
We introduce the task of weakly supervised learning for detecting human and object interactions in videos. Our task poses unique challenges as a system does not know what types of human-object interactions are present in a video or the actual spatiotemporal location of the human and the object. To address these challenges, we introduce a contrastive weakly supervised training loss that aims to jointly associate spatiotemporal regions in a video with an action and object vocabulary and encourage temporal continuity of the visual appearance of moving objects as a form of self-supervision. To train our model, we introduce a dataset comprising over 6.5k videos with human-object interaction annotations that have been semi-automatically curated from sentence captions associated with the videos. We demonstrate improved performance over weakly supervised baselines adapted to our task on our video dataset.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
EF15_003/0000468: Intelligent Machine Perception
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
ICCV2021: Proceedings of the International Conference on Computer Vision
ISBN
978-1-6654-2812-5
ISSN
1550-5499
e-ISSN
2380-7504
Number of pages
11
Pages from-to
1825-1835
Publisher name
IEEE
Place of publication
Piscataway
Event location
Montreal
Event date
Oct 11, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000797698902003