HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F24%3A00372774" target="_blank" >RIV/68407700:21230/24:00372774 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/s11263-023-01982-9" target="_blank" >https://doi.org/10.1007/s11263-023-01982-9</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s11263-023-01982-9" target="_blank" >10.1007/s11263-023-01982-9</a>

Alternative languages

Result language
English
Title in original language
HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer
Result description in original language
Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly performed by the forward pass through the network. However, in a large and ambiguous environment, learning such a regression task directly can be difficult for a single network. In this work, we present a new hierarchical scene coordinate network to predict pixel scene coordinates in a coarse-to-fine manner from a single RGB image. The proposed method, which is an extension of HSCNet, allows us to train compact models which scale robustly to large environments. It sets a new state-of-the-art for single-image localization on the 7-Scenes, 12-Scenes, Cambridge Landmarks datasets, and the combined indoor scenes.

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

Project
The result was created during the realization of multiple projects. More information is in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of publication
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
International Journal of Computer Vision
ISSN
0920-5691
e-ISSN
1573-1405
Periodical volume
132
Issue number within the volume
7
Publisher country
NL - Netherlands
Number of pages
21
Pages from-to
2530-2550
UT WoS article code
001156667100002
Scopus EID of the result
2-s2.0-85187172970

Similar results (10)

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose
EPOS: Estimating 6D Pose of Objects with Symmetries
D2-Net: A Trainable CNN for Joint Description and Detection of Local Features
