Speeding up Continuous kNN Join by Binary Sketches
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F18%3A00100950" target="_blank" >RIV/00216224:14330/18:00100950 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-95786-9_14" target="_blank" >http://dx.doi.org/10.1007/978-3-319-95786-9_14</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-95786-9_14" target="_blank" >10.1007/978-3-319-95786-9_14</a>
Alternative languages
Result language
angličtina
Original language name
Speeding up Continuous kNN Join by Binary Sketches
Original language description
Real-time recommendation is a necessary component of current social applications. It is responsible for suggesting relevant newly published data to the users based on their preferences. By representing the users and the published data in a metric space, each user can be recommended with their k nearest neighbors among the published data, i.e., the kNN join is computed. In this work, we aim at a frequent requirement that only the recently published data are subject of the recommendation, thus a sliding time window is defined and only the data published within the limits of the window can be recommended. Due to large amounts of both the users and the published data, it becomes a challenging task to continuously update the results of the kNN join as new data come into and go out of the sliding window. We propose a binary sketch-based approximation technique suited especially to cases when the metric distance computation is an expensive operation (e.g., the Euclidean distance in high dimensional vector spaces). It applies cheap Hamming distances to skip over 90% of the expensive metric distance computations. As revealed by our experiments on 4,096 dimensional vectors, the proposed approach significantly outperforms compared existing approaches.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/GA16-18889S" target="_blank" >GA16-18889S: Big Data Analytics for Unstructured Data</a><br>
Continuities
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Others
Publication year
2018
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Article name in the collection
Advances in Data Mining
ISBN
9783319957852
ISSN
0302-9743
e-ISSN
—
Number of pages
16
Pages from-to
183-198
Publisher name
Springer
Place of publication
Cham
Event location
New York, USA
Event date
Jul 11, 2018
Type of event by nationality
WRD - Celosvětová akce
UT code for WoS article
—