Towards Faster Similarity Search by Dynamic Reordering of Streamed Queries
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F18%3A00101119" target="_blank" >RIV/00216224:14330/18:00101119 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-662-58384-5_3" target="_blank" >http://dx.doi.org/10.1007/978-3-662-58384-5_3</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-662-58384-5_3" target="_blank" >10.1007/978-3-662-58384-5_3</a>
Alternative languages
Result language
English
Original language name
Towards Faster Similarity Search by Dynamic Reordering of Streamed Queries
Original language description
The current era of digital data explosion calls for content-based similarity search techniques, since traditional searchable metadata such as annotations are not always available. In our work, we focus on a scenario where similarity search is used in the context of stream processing, which is one of the suitable approaches for dealing with huge amounts of data. Our goal is to maximize the throughput of processed queries, given that a slight delay is acceptable. We propose a technique that dynamically reorders the queries coming from the stream in order to use our caching mechanism in huge data spaces more effectively. We were able to achieve significantly higher throughput than the baseline, in which neither reordering nor caching was used. Moreover, our proposal does not incur any additional precision loss in the similarity search, as opposed to some other caching techniques. In addition to throughput maximization, we also study the potential of trading off throughput for low delays (waiting times). The proposed technique can be parameterized by the amount of throughput that can be sacrificed.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/GA16-18889S" target="_blank" >GA16-18889S: Big Data Analytics for Unstructured Data</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2018
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXVIII
ISBN
9783662583838
ISSN
0302-9743
e-ISSN
—
Number of pages
28
Pages from-to
61-88
Publisher name
Springer
Place of publication
Berlin, Heidelberg
Event location
Berlin, Heidelberg
Event date
Jan 1, 2018
Type of event by nationality
CST - Nationwide event
UT code for WoS article
—