Similarity Search with the Distance Density Model
The result's identifiers
Result code in IS VaVaI
RIV/00216224:14330/22:00127335 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F22%3A00127335)
Result on the web
https://link.springer.com/chapter/10.1007/978-3-031-17849-8_10
DOI - Digital Object Identifier
10.1007/978-3-031-17849-8_10 (http://dx.doi.org/10.1007/978-3-031-17849-8_10)
Alternative languages
Result language
English
Original language name
Similarity Search with the Distance Density Model
Original language description
The metric space model of similarity has become a standard formal paradigm for implementing generic similarity search engines. However, its constraints of identity and symmetry prevent it from expressing the subjectivity and context dependence of similarity as perceived by humans. In this paper, we study the suitability of the Distance density model of similarity for searching. First, we use the Local Outlier Factor (LOF) to estimate data density in search collections and evaluate many queries using both the standard geometric model and its extension that respects the densities. We let 200 people assess the search effectiveness of the two alternatives through a web interface. Encouraged by the positive effects of the Distance density model, we propose an alternative way to estimate the data densities that avoids the LOF computation, whose complexity is quadratic in the dataset size. Sketches with unbalanced bits are shown to correlate with LOFs, which opens the possibility of efficiently implementing large-scale similarity search systems based on the Distance density model.
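As an illustration of the density estimate the description refers to, the following is a minimal brute-force sketch of the standard Local Outlier Factor definition. It is not taken from the paper: the function name `local_outlier_factors`, the Euclidean distance, the tie-breaking rule, and the toy data are all assumptions made for this example. The full pairwise distance matrix is the quadratic step that the description mentions.

```python
import math


def local_outlier_factors(points, k):
    """Return the LOF score of every point (brute force, hypothetical helper).

    Simplification: each point gets exactly k neighbours, with ties in
    distance broken by index, instead of the full tie-inclusive neighbourhood.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # All pairwise Euclidean distances: O(n^2) time and memory.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Indices of the k nearest neighbours of each point.
    neighbours = [
        sorted((j for j in range(n) if j != i), key=lambda j: (dist[i][j], j))[:k]
        for i in range(n)
    ]

    # k-distance: distance from a point to its k-th nearest neighbour.
    k_dist = [dist[i][neighbours[i][-1]] for i in range(n)]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k-distance(j), d(i, j)).
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neighbours[i])
        if reach == 0:
            raise ValueError("duplicate points make the density undefined")
        lrd.append(k / reach)

    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Toy data (assumed): a tight square of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = local_outlier_factors(points, k=2)
```

Scores close to 1 indicate a point whose local density matches that of its neighbours, while the isolated point (5, 5) receives the largest score here. How such a density value enters the Distance density model itself is not specified in this record, so the sketch stops at the LOF scores.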
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
EF16_019/0000822: CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2022
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Similarity Search and Applications: 15th International Conference, SISAP 2022, Bologna, Italy, October 5 - October 7, 2022, Proceedings
ISBN
9783031178481
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
15
Pages from-to
118-132
Publisher name
Springer
Place of publication
Cham
Event location
Bologna, Italy
Event date
Jan 1, 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000874756300010