Efficient Large-Scale Semantic Visual Localization in 2D Maps
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F21%3A00349961" target="_blank" >RIV/68407700:21230/21:00349961 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/978-3-030-69535-4_17" target="_blank" >https://doi.org/10.1007/978-3-030-69535-4_17</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-69535-4_17" target="_blank" >10.1007/978-3-030-69535-4_17</a>
Alternative languages
Result language
English
Title in original language
Efficient Large-Scale Semantic Visual Localization in 2D Maps
Result description in original language
With the emergence of autonomous navigation systems, image-based localization is one of the essential tasks to be tackled. However, most of the current algorithms struggle to scale to city-size environments mainly because of the need to collect large (semi-)annotated datasets for CNN training and create databases for test environment of images, key-point level features or image embeddings. This data acquisition is not only expensive and time-consuming but also may cause privacy concerns. In this work, we propose a novel framework for semantic visual localization in city-scale environments which alleviates the aforementioned problem by using freely available 2D maps such as OpenStreetMap. Our method does not require any images or image-map pairs for training or test environment database collection. Instead, a robust embedding is learned from a depth and building instance label information of a particular location in the 2D map. At test time, this embedding is extracted from a panoramic building instance label and depth images. It is then used to retrieve the closest match in the database. We evaluate our localization framework on two large-scale datasets consisting of Cambridge and San Francisco cities with a total length of drivable roads spanning 500 km and including approximately 110k unique locations. To the best of our knowledge, this is the first large-scale semantic localization method which works on par with approaches that require the availability of images at train time or for test environment database creation.
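The retrieval step mentioned in the description (an embedding extracted at test time is matched against a database of embeddings) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record does not specify the distance metric, the embedding dimension, or any function names, so `l2_distance`, `retrieve_closest`, the location identifiers, and the toy vectors are hypothetical and are not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_closest(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k database entries nearest to the query embedding.

    Assumption: nearest means smallest Euclidean distance. The source only
    says the embedding is used to retrieve the closest match.
    """
    ranked = sorted(
        ((location_id, l2_distance(query, embedding))
         for location_id, embedding in database.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


# Hypothetical toy database: location identifier -> embedding derived from the 2D map.
toy_database = {
    "loc_a": [0.0, 0.0, 1.0],
    "loc_b": [1.0, 0.0, 0.0],
    "loc_c": [0.0, 1.0, 0.0],
}

# Hypothetical query embedding extracted from a test-time panorama.
toy_query = [0.9, 0.1, 0.0]

best_match = retrieve_closest(toy_query, toy_database, k=1)
```

A linear scan like this is only practical for small databases. At the scale stated in the description (approximately 110k locations), an approximate nearest-neighbour index would be a common substitute, although the record does not say which search strategy is used.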
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
ACCV2020: Proceedings of the 15th Asian Conference on Computer Vision - Part III
ISBN
978-3-030-69534-7
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
16
Pages from-to
273-288
Publisher name
Springer
Place of publication
Cham
Event location
Kyoto
Event date
30 November 2020
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—