Distributed Asynchronous Regular Path Queries (RPQs) on Graphs
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A10474284" target="_blank" >RIV/00216208:11320/23:10474284 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1145/3626562.3626833" target="_blank" >https://doi.org/10.1145/3626562.3626833</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1145/3626562.3626833" target="_blank" >10.1145/3626562.3626833</a>
Alternative languages
Result language
English
Title in original language
Distributed Asynchronous Regular Path Queries (RPQs) on Graphs
Result description in original language
Graph engines play a crucial role in modern data analytics pipelines, serving as middleware for handling complex queries across various domains, such as financial fraud detection. Graph queries enable flexible exploration and analysis, akin to SQL in relational databases. Among the most expressive and powerful constructs of graph querying are regular path queries (RPQs). RPQs support variable-length path patterns based on regular expressions, such as (p1:Person)-/:Knows+/->(p2:Person), which searches for non-empty paths of any length between two persons. In this paper, we introduce a novel design for distributed RPQs that builds on top of distributed asynchronous pipelined traversals to enable (i) memory control of path explorations, with (ii) great performance and scalability. Through our evaluation, we show that with sixteen machines, it outperforms Neo4j by 91x on average and a relational implementation of the same queries in PostgreSQL by 230x, while maintaining low memory consumption.
Title in English
Distributed Asynchronous Regular Path Queries (RPQs) on Graphs
Result description in English
Graph engines play a crucial role in modern data analytics pipelines, serving as middleware for handling complex queries across various domains, such as financial fraud detection. Graph queries enable flexible exploration and analysis, akin to SQL in relational databases. Among the most expressive and powerful constructs of graph querying are regular path queries (RPQs). RPQs support variable-length path patterns based on regular expressions, such as (p1:Person)-/:Knows+/->(p2:Person), which searches for non-empty paths of any length between two persons. In this paper, we introduce a novel design for distributed RPQs that builds on top of distributed asynchronous pipelined traversals to enable (i) memory control of path explorations, with (ii) great performance and scalability. Through our evaluation, we show that with sixteen machines, it outperforms Neo4j by 91x on average and a relational implementation of the same queries in PostgreSQL by 230x, while maintaining low memory consumption.
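To make the semantics of the example pattern (p1:Person)-/:Knows+/->(p2:Person) concrete, the following is a minimal single-machine sketch in Python. It is not the distributed asynchronous design the paper describes; it only illustrates what "non-empty paths of any length" means when the query is read as reachability. The function name knows_plus, the edge-list representation, and the toy graph are assumptions made for illustration.

```python
from collections import deque


def knows_plus(edges, source):
    """Return every vertex reachable from `source` via ONE OR MORE Knows edges.

    `edges` is an iterable of (src, dst) pairs (a hypothetical in-memory
    representation; a real graph engine would not store the graph this way).
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    reached = set()
    # Seed the frontier with the direct neighbours, not with `source` itself:
    # the `+` quantifier requires at least one hop, so `source` only counts
    # as a match if some cycle leads back to it.
    frontier = deque(adjacency.get(source, []))
    while frontier:
        person = frontier.popleft()
        if person in reached:
            continue  # already matched; skipping keeps memory bounded by |V|
        reached.add(person)
        frontier.extend(adjacency.get(person, []))
    return reached


# Hypothetical toy graph: ann -> bob -> cid -> ann forms a cycle, dee -> ann.
KNOWS = [("ann", "bob"), ("bob", "cid"), ("cid", "ann"), ("dee", "ann")]

if __name__ == "__main__":
    for p1 in ("ann", "dee"):
        print(p1, sorted(knows_plus(KNOWS, p1)))
```

In this sketch "ann" matches herself because the cycle provides a non-empty path back to her, whereas "dee" does not appear in her own result. Deduplicating on the reached set is one simple way to bound the memory of a path exploration; the memory-control mechanism of the paper itself is not reproduced here.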
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2023
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the proceedings
Middleware '23: Proceedings of the 24th International Middleware Conference: Industrial Track
ISBN
979-8-4007-0427-7
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
35-41
Publisher name
Association for Computing Machinery
Place of publication
New York, United States
Event location
Bologna, Italy
Event date
11 December 2023
Event type by nationality
WRD - Worldwide event
UT WoS article code
—