Modifying Hamming Spaces for Efficient Search
Result identifiers
Result code in IS VaVaI
RIV/00216224:14330/18:00103668 - https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F18%3A00103668
Result on the web
http://dx.doi.org/10.1109/ICDMW.2018.00137
DOI - Digital Object Identifier
10.1109/ICDMW.2018.00137
Alternative languages
Result language
English
Title in original language
Modifying Hamming Spaces for Efficient Search
Result description in original language
We focus on the efficient search for the bit strings most similar to a given query in the Hamming space. The distance in this space can be lower-bounded by a function based on the difference in the number of ones in the compared strings, i.e. their weights. Recently, this property has been successfully used by the Hamming Weight Tree (HWT) indexing structure. We propose modifications of the bit strings that preserve pairwise Hamming distances but improve the tightness of these lower bounds, so that query evaluation with the HWT is several times faster. We also show that unbalanced bit strings, recently reported to provide search quality similar to that of the traditionally used balanced bit strings, are easier to index with the HWT. Combined with the distance-preserving modifications, HWT query evaluation can be more than one order of magnitude faster than the HWT baseline. Finally, we show that such modifications are useful even for very complex data where the search with the HWT is slower than a sequential search.
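The weight-based lower bound mentioned in the description can be illustrated with a minimal sketch. It relies on the standard fact that the Hamming distance of two bit strings is at least the absolute difference of their weights. As an illustrative assumption, the sketch uses XOR with a fixed mask as one well-known transformation that preserves pairwise Hamming distances; it is not claimed to be the exact modification proposed in the paper. The function names, the sample data, and the mask are all hypothetical.

```python
from itertools import combinations


def weight(x: int) -> int:
    """Number of ones in the bit string represented by the integer x."""
    return bin(x).count("1")


def hamming(x: int, y: int) -> int:
    """Hamming distance: number of positions where x and y differ."""
    return bin(x ^ y).count("1")


def weight_lower_bound(x: int, y: int) -> int:
    """Lower bound on the Hamming distance from the weights alone."""
    return abs(weight(x) - weight(y))


def flip_with_mask(strings, mask):
    """Flip the same bit positions in every string (assumed example
    of a distance-preserving modification, not the paper's method)."""
    return [s ^ mask for s in strings]


# Hypothetical 8-bit data: every string is balanced (weight 4),
# so the weight-based lower bound is 0 for every pair.
data = [0b10110010, 0b01101100, 0b11100001, 0b00011110]
mask = 0b00001111  # hypothetical choice of positions to flip

flipped = flip_with_mask(data, mask)

pairs = list(combinations(range(len(data)), 2))
bound_before = sum(weight_lower_bound(data[i], data[j]) for i, j in pairs)
bound_after = sum(weight_lower_bound(flipped[i], flipped[j]) for i, j in pairs)

for i, j in pairs:
    # Pairwise distances are unchanged by the common mask.
    assert hamming(data[i], data[j]) == hamming(flipped[i], flipped[j])
    # The bound never exceeds the true distance.
    assert weight_lower_bound(flipped[i], flipped[j]) <= hamming(flipped[i], flipped[j])

print("sum of lower bounds before:", bound_before)
print("sum of lower bounds after: ", bound_after)
```

In this toy data set the flipped strings have different weights, so the lower bounds become non-zero while every pairwise distance stays the same. This is the kind of effect that lets a weight-based index discard more candidates.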
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
EF16_019/0000822: Centre of Excellence for Cybercrime, Cybersecurity and Protection of Critical Information Infrastructures
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities
Others
Publication year
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
18th International Conference on Data Mining Workshops (ICDMW), Singapore, November 17-21, 2018
ISBN
9781538692882
ISSN
2375-9232
e-ISSN
—
Number of pages
9
Pages from-to
945-953
Publisher name
IEEE
Place of publication
USA
Event location
Singapore
Event date
1. 1. 2018
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000465766800128