BM-index: Balanced Metric Space Index based on Weighted Voronoi Partitioning
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F19%3A00109747" target="_blank" >RIV/00216224:14330/19:00109747 - isvavai.cz</a>
Výsledek na webu
<a href="https://link.springer.com/chapter/10.1007/978-3-030-28730-6_21" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-030-28730-6_21</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-28730-6_21" target="_blank" >10.1007/978-3-030-28730-6_21</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
BM-index: Balanced Metric Space Index based on Weighted Voronoi Partitioning
Popis výsledku v původním jazyce
Processing large volumes of various data needs index structures that can efficiently organize them on secondary memory. Methods based on so-called pivot permutations have become popular in addressing these requirements because of their tremendous querying performance. They localize data objects by ordering preselected anchor objects by their distances to the data objects, and so no coordinate system is exploited to partition the data. This represents a generic solution for unstructured and high-dimensional data. In principle, pivot permutations implement recursive Voronoi tessellation. Also, due to the fixed preselected anchors, such partitioning cannot adapt to the data distribution and leads to very unbalanced cells. In this paper, we address this issue and propose a novel schema called BM-index. It exploits weighted Voronoi partitioning to create pivot permutations that adapt to data distribution. Secondary memory is then accessed efficiently with respect to the existing disk-oriented structures, such as M-index. We present an algorithm to balance the data partitions, and we show its correctness. In experiments on a real-life image collection CoPhIR, we show superior performance in I/O costs when evaluating k-nearest neighbors queries.
Název v anglickém jazyce
BM-index: Balanced Metric Space Index based on Weighted Voronoi Partitioning
Popis výsledku anglicky
Processing large volumes of various data needs index structures that can efficiently organize them on secondary memory. Methods based on so-called pivot permutations have become popular in addressing these requirements because of their tremendous querying performance. They localize data objects by ordering preselected anchor objects by their distances to the data objects, and so no coordinate system is exploited to partition the data. This represents a generic solution for unstructured and high-dimensional data. In principle, pivot permutations implement recursive Voronoi tessellation. Also, due to the fixed preselected anchors, such partitioning cannot adapt to the data distribution and leads to very unbalanced cells. In this paper, we address this issue and propose a novel schema called BM-index. It exploits weighted Voronoi partitioning to create pivot permutations that adapt to data distribution. Secondary memory is then accessed efficiently with respect to the existing disk-oriented structures, such as M-index. We present an algorithm to balance the data partitions, and we show its correctness. In experiments on a real-life image collection CoPhIR, we show superior performance in I/O costs when evaluating k-nearest neighbors queries.
Klasifikace
Druh
D - Stať ve sborníku
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
<a href="/cs/project/EF16_019%2F0000822" target="_blank" >EF16_019/0000822: Centrum excelence pro kyberkriminalitu, kyberbezpečnost a ochranu kritických informačních infrastruktur</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>S - Specificky vyzkum na vysokych skolach
Ostatní
Rok uplatnění
2019
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název statě ve sborníku
Advances in Databases and Information Systems, 23th East European Conference, ADBIS 2019
ISBN
9783030287290
ISSN
0302-9743
e-ISSN
—
Počet stran výsledku
17
Strana od-do
337-353
Název nakladatele
Springer International Publishing
Místo vydání
Cham
Místo konání akce
Slovenia
Datum konání akce
1. 1. 2019
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
000558104700024