Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-Modal Distillation

The result's identifiers

  • Result code in IS VaVaI

    RIV/68407700:21730/22:00359337 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21730%2F22%3A00359337)

  • Result on the web

    https://doi.org/10.1007/978-3-031-19839-7_28

  • DOI - Digital Object Identifier

    10.1007/978-3-031-19839-7_28

Alternative languages

  • Result language

    English

  • Original language name

    Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-Modal Distillation

  • Original language description

    This work investigates learning pixel-wise semantic image segmentation in urban scenes without any manual annotation, just from the raw non-curated data collected by cars which, equipped with cameras and LiDAR sensors, drive around a city. Our contributions are threefold. First, we propose a novel method for cross-modal unsupervised learning of semantic image segmentation by leveraging synchronized LiDAR and image data. The key ingredient of our method is the use of an object proposal module that analyzes the LiDAR point cloud to obtain proposals for spatially consistent objects. Second, we show that these 3D object proposals can be aligned with the input images and reliably clustered into semantically meaningful pseudo-classes. Finally, we develop a cross-modal distillation approach that leverages image data partially annotated with the resulting pseudo-classes to train a transformer-based model for image semantic segmentation. We show the generalization capabilities of our method by testing on four different testing datasets (Cityscapes, Dark Zurich, Nighttime Driving and ACDC) without any finetuning, and demonstrate significant improvements compared to the current state of the art on this problem.

  • Czech name

  • Czech description
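
The original-language description above mentions aligning LiDAR object proposals with the input images to obtain partially annotated images. A minimal sketch of that kind of alignment is given below, assuming a simple pinhole camera and points already expressed in the camera frame. It is not the authors' implementation: the function names, the intrinsics `fx`, `fy`, `cx`, `cy`, the `IGNORE` label and the `segment_to_class` mapping are all hypothetical names introduced for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]

IGNORE = -1  # assumed label for pixels that receive no pseudo-class


def project_point(p: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Tuple[int, int]]:
    """Pinhole projection of a camera-frame point; None if it lies behind the camera."""
    x, y, z = p
    if z <= 0.0:
        return None
    u = fx * x / z + cx
    v = fy * y / z + cy
    return int(round(u)), int(round(v))


def paint_pseudo_labels(points: Sequence[Point3D],
                        segment_ids: Sequence[int],
                        segment_to_class: Dict[int, int],
                        width: int, height: int,
                        fx: float, fy: float,
                        cx: float, cy: float) -> List[List[int]]:
    """Build a sparse label map: each pixel hit by a LiDAR point gets the
    pseudo-class of that point's segment; all other pixels stay IGNORE."""
    labels = [[IGNORE] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for p, seg in zip(points, segment_ids):
        uv = project_point(p, fx, fy, cx, cy)
        if uv is None or seg not in segment_to_class:
            continue
        u, v = uv
        if not (0 <= u < width and 0 <= v < height):
            continue
        # Keep the nearest point when several project to the same pixel.
        if p[2] < depth[v][u]:
            depth[v][u] = p[2]
            labels[v][u] = segment_to_class[seg]
    return labels
```

The resulting map is only partially labelled, so a segmentation model trained on it would typically skip the `IGNORE` pixels when computing its loss.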

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

    S - Specific research at universities

Others

  • Publication year

    2022

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Computer Vision – ECCV 2022, Part XXXVIII

  • ISBN

    978-3-031-19838-0

  • ISSN

    0302-9743

  • e-ISSN

    1611-3349

  • Number of pages

    18

  • Pages from-to

    478-495

  • Publisher name

    Springer

  • Place of publication

    Cham

  • Event location

    Tel Aviv

  • Event date

    Oct 23, 2022

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article

    000903760400028