Learned Indexing in Proteins: Substituting Complex Distance Calculations with Embedding and Clustering Techniques
The result's identifiers
Result code in IS VaVaI
RIV/00216224:14330/22:00126460 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F22%3A00126460)
Result on the web
https://link.springer.com/chapter/10.1007/978-3-031-17849-8_22
DOI - Digital Object Identifier
10.1007/978-3-031-17849-8_22 (http://dx.doi.org/10.1007/978-3-031-17849-8_22)
Alternative languages
Result language
English
Original language name
Learned Indexing in Proteins: Substituting Complex Distance Calculations with Embedding and Clustering Techniques
Original language description
Despite the constant evolution of similarity searching research, it continues to face challenges stemming from the complexity of the data, such as the curse of dimensionality and computationally expensive distance functions. Various machine learning techniques have proven capable of replacing elaborate mathematical models with simple linear functions, often gaining speed and simplicity at the cost of formal guarantees of accuracy and correctness of querying. The authors explore the potential of this research trend by presenting a lightweight solution for the complex problem of 3D protein structure search. The solution consists of three steps – (i) transformation of 3D protein structural information into very compact vectors, (ii) use of a probabilistic model to group these vectors and respond to queries by returning a given number of similar objects, and (iii) a final filtering step which applies basic vector distance functions to refine the result.
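The query path described above can be illustrated with a minimal sketch of steps (ii) and (iii). It assumes step (i) has already produced fixed-length vectors. The record does not specify the probabilistic model, so a small k-means grouping with a softmax over negative centroid distances is used here purely as a stand-in, and every function and parameter name (`fit_buckets`, `bucket_probabilities`, `search`, `stop_after`) is hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Basic vector distance used for grouping and for the final filtering step."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(vectors, centroids):
    """Put each vector index into the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: euclidean(v, centroids[c]))
        buckets[nearest].append(i)
    return buckets


def fit_buckets(vectors, k, iters=20, seed=0):
    """Stand-in for step (ii): group the embeddings with plain k-means (an assumption)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        buckets = assign(vectors, centroids)
        for c, members in enumerate(buckets):
            if members:  # keep the old centroid if a bucket is empty
                columns = zip(*(vectors[i] for i in members))
                centroids[c] = [sum(col) / len(members) for col in columns]
    return centroids, assign(vectors, centroids)


def bucket_probabilities(query, centroids):
    """Softmax over negative centroid distances (an assumed scoring rule)."""
    scores = [-euclidean(query, c) for c in centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def search(query, vectors, centroids, buckets, k, stop_after):
    """Visit buckets from most to least probable until `stop_after` candidates
    are collected, then apply step (iii): rank candidates by vector distance."""
    probs = bucket_probabilities(query, centroids)
    candidates = []
    for c in sorted(range(len(centroids)), key=lambda c: -probs[c]):
        candidates.extend(buckets[c])
        if len(candidates) >= stop_after:
            break
    candidates.sort(key=lambda i: euclidean(query, vectors[i]))
    return candidates[:k]


# Toy embeddings standing in for the compact protein vectors of step (i).
vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, buckets = fit_buckets(vectors, k=2)
hits = search([0.1, 0.2], vectors, centroids, buckets, k=2, stop_after=3)
```

A smaller `stop_after` inspects fewer buckets and answers faster, at the cost of possibly missing true neighbours; setting it to the collection size degenerates to an exact scan over the vectors.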
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10200 - Computer and information sciences
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific university research
Others
Publication year
2022
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Similarity Search and Applications, 15th International Conference, SISAP 2022, Bologna, Italy, October 5–7, 2022, Proceedings
ISBN
9783031178481
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
9
Pages from-to
274-282
Publisher name
Springer Cham
Place of publication
Cham
Event location
Bologna, Italy
Event date
Oct 5, 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000874756300022