Web Scale Image Clustering
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F08%3A00160874" target="_blank" >RIV/68407700:21230/08:00160874 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Web Scale Image Clustering
Result description in original language
We propose a randomized data mining method that finds clusters of spatially overlapping images. The core of the method relies on the min-Hash algorithm for fast detection of so-called cluster seeds. The seeds are then used as visual queries to obtain clusters, which are formed as transitive closures of sets of partially overlapping images that include the seed. We show that the probability of finding a seed for an image cluster rapidly increases with the size of the cluster. The properties and performance of the algorithm are demonstrated on datasets with 10^4 and 10^5 images. The speed of the method depends on the size of the database and is close to linear for database sizes up to approximately 2^34 ≈ 10^10 images. The proposed algorithm provides, as a side effect, state-of-the-art near-duplicate image detection.
Title in English
Web Scale Image Clustering
Result description in English
We propose a randomized data mining method that finds clusters of spatially overlapping images. The core of the method relies on the min-Hash algorithm for fast detection of so-called cluster seeds. The seeds are then used as visual queries to obtain clusters, which are formed as transitive closures of sets of partially overlapping images that include the seed. We show that the probability of finding a seed for an image cluster rapidly increases with the size of the cluster. The properties and performance of the algorithm are demonstrated on datasets with 10^4 and 10^5 images. The speed of the method depends on the size of the database and is close to linear for database sizes up to approximately 2^34 ≈ 10^10 images. The proposed algorithm provides, as a side effect, state-of-the-art near-duplicate image detection.
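The pipeline in the description (min-Hash seed detection followed by cluster growth as a transitive closure) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each image is a non-empty set of integer visual-word ids, uses a Jaccard threshold as a stand-in for the spatial-overlap test, and replaces the visual query with a brute-force scan. All function names and the parameters `k`, `s`, and `threshold` are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations

P = (1 << 61) - 1  # Mersenne prime; word ids are assumed to lie in [0, P)

def make_hash_funcs(n, seed=0):
    """n random affine hash functions h(w) = (a*w + b) mod P."""
    rng = random.Random(seed)
    return [(rng.randrange(1, P), rng.randrange(0, P)) for _ in range(n)]

def sketches(words, funcs, s):
    """Group the per-function min-hashes of a word set into s-tuples."""
    mins = [min((a * w + b) % P for w in words) for a, b in funcs]
    return [tuple(mins[i:i + s]) for i in range(0, len(mins), s)]

def find_seeds(images, k=64, s=2, seed=0):
    """Return image pairs that share at least one of k sketches."""
    funcs = make_hash_funcs(k * s, seed)
    buckets = defaultdict(set)
    for name, words in images.items():
        for i, sk in enumerate(sketches(words, funcs, s)):
            buckets[(i, sk)].add(name)
    seeds = set()
    for names in buckets.values():
        seeds.update(combinations(sorted(names), 2))
    return seeds

def jaccard(a, b):
    return len(a & b) / len(a | b)

def grow_cluster(seed_image, images, threshold=0.2):
    """Transitive closure: keep querying with every newly added image.

    The Jaccard threshold is an assumed stand-in for verifying that
    two images partially overlap.
    """
    cluster, frontier = {seed_image}, [seed_image]
    while frontier:
        query = frontier.pop()
        for name, words in images.items():
            if name not in cluster and jaccard(images[query], words) >= threshold:
                cluster.add(name)
                frontier.append(name)
    return cluster

def sketch_collision_prob(sim, s, k):
    """Probability that two sets with Jaccard similarity `sim`
    share at least one of k independent s-tuple sketches."""
    return 1.0 - (1.0 - sim ** s) ** k
```

A cluster is grown by calling `grow_cluster` on either image of a pair returned by `find_seeds`. Because every colliding pair is a potential seed, a cluster with more images offers more pairs and therefore more chances of at least one collision, which is consistent with the description's claim about cluster size.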
Classification
Type
O - Other results
CEP field
JD - Use of computers, robotics and its applications
OECD FORD field
—
Result continuities
Project
<a href="/cs/project/7E08031" target="_blank" >7E08031: Dynamic Interactive Perception-action Learning in Cognitive Systems</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2008
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations