CRANBERRY: Memory-Effective Search in 100M High-Dimensional CLIP Vectors
The result's identifiers
Result code in IS VaVaI
RIV/00216224:14330/23:00131529 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F23%3A00131529)
Result on the web
https://link.springer.com/chapter/10.1007/978-3-031-46994-7_26
DOI - Digital Object Identifier
10.1007/978-3-031-46994-7_26 (http://dx.doi.org/10.1007/978-3-031-46994-7_26)
Alternative languages
Result language
English
Original language name
CRANBERRY: Memory-Effective Search in 100M High-Dimensional CLIP Vectors
Original language description
Recent advances in cross-modal multimedia data analysis require efficient similarity search at the scale of hundreds of millions of high-dimensional vectors. We address this task by proposing the CRANBERRY algorithm, which combines and tunes several existing similarity search strategies. In particular, the algorithm: (1) employs Voronoi partitioning to obtain a query-relevant candidate set in constant time, (2) applies filtering techniques to prune the obtained candidates significantly, and (3) re-ranks the retained candidate vectors with respect to the query vector. Applied to a dataset of 100 million 768-dimensional vectors, the algorithm evaluates 10NN queries with 90% recall and an average query latency of 1.2 s, with a throughput of 15 queries per second on a server with a 56-core CPU and 4.7 queries per second on a PC.
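The three steps named in the description can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`build_cells`, `search`), the parameters (`n_cells`, `keep`, `sketch_dims`), the random choice of pivots, and the prefix-coordinate filter are all assumptions made for illustration, since the description does not say which filtering techniques or pivot selection the algorithm uses.

```python
import heapq
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_cells(vectors, pivots):
    """Assign each vector id to the Voronoi cell of its nearest pivot."""
    cells = {p: [] for p in range(len(pivots))}
    for vid, v in enumerate(vectors):
        nearest = min(range(len(pivots)), key=lambda p: dist(v, pivots[p]))
        cells[nearest].append(vid)
    return cells


def search(query, vectors, pivots, cells, k=10, n_cells=2, keep=50, sketch_dims=4):
    """Return ids of approximate k nearest neighbours of `query`."""
    # (1) Candidate set: ids stored in the n_cells cells whose pivots
    #     are closest to the query.
    closest = heapq.nsmallest(
        n_cells, range(len(pivots)), key=lambda p: dist(query, pivots[p])
    )
    candidates = [vid for p in closest for vid in cells[p]]

    # (2) Filtering (assumed stand-in): rank candidates by a cheap distance
    #     on the first sketch_dims coordinates and retain the best `keep`.
    q_short = query[:sketch_dims]
    retained = heapq.nsmallest(
        keep, candidates, key=lambda vid: dist(q_short, vectors[vid][:sketch_dims])
    )

    # (3) Re-ranking: exact full-dimensional distance on the retained ids.
    return heapq.nsmallest(k, retained, key=lambda vid: dist(query, vectors[vid]))


# Toy usage: 200 random 8-dimensional vectors, 5 pivots sampled from the data.
random.seed(0)
data = [[random.random() for _ in range(8)] for _ in range(200)]
pivots = random.sample(data, 5)
cells = build_cells(data, pivots)
print(search(data[0], data, pivots, cells, k=3))
```

In this sketch, recall is traded against cost through `n_cells` and `keep`: probing every cell and retaining every candidate reduces the search to an exact linear scan.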
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10200 - Computer and information sciences
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
16th International Conference on Similarity Search and Applications (SISAP)
ISBN
9783031469930
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
9
Pages from-to
300-308
Publisher name
Springer
Place of publication
Cham
Event location
A Coruña, Spain
Event date
Jan 1, 2023
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—