An efficient parallel algorithm for mining weighted clickstream patterns
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F22%3A63556059" target="_blank" >RIV/70883521:28140/22:63556059 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0020025521008781?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0020025521008781?via%3Dihub</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.ins.2021.08.070" target="_blank" >10.1016/j.ins.2021.08.070</a>
Alternative languages
Result language
English
Title in the original language
An efficient parallel algorithm for mining weighted clickstream patterns
Result description in the original language
In the Internet age, analyzing the behavior of online users can help webstore owners understand customers’ interests. Insights from such analysis can be used to improve both user experience and website design. A prominent task for online behavior analysis is clickstream mining, which consists of identifying customer browsing patterns that reveal how users interact with websites. Recently, this task was extended to consider weights to find more impactful patterns. However, most algorithms for mining weighted clickstream patterns are serial algorithms, which are sequentially executed from the start to the end on one running thread. In real life, data is often very large, and serial algorithms can have long runtimes as they do not fully take advantage of the parallelism capabilities of modern multi-core CPUs. To address this limitation, this paper presents two parallel algorithms named DPCompact-SPADE (Depth load balancing Parallel Compact-SPADE) and APCompact-SPADE (Adaptive Parallel Compact-SPADE) for weighted clickstream pattern mining. Experiments on various datasets show that the proposed parallel algorithms are efficient and outperform state-of-the-art serial algorithms in terms of runtime, memory consumption, and scalability. © 2021 Elsevier Inc.
Title in English
An efficient parallel algorithm for mining weighted clickstream patterns
Result description in English
In the Internet age, analyzing the behavior of online users can help webstore owners understand customers’ interests. Insights from such analysis can be used to improve both user experience and website design. A prominent task for online behavior analysis is clickstream mining, which consists of identifying customer browsing patterns that reveal how users interact with websites. Recently, this task was extended to consider weights to find more impactful patterns. However, most algorithms for mining weighted clickstream patterns are serial algorithms, which are sequentially executed from the start to the end on one running thread. In real life, data is often very large, and serial algorithms can have long runtimes as they do not fully take advantage of the parallelism capabilities of modern multi-core CPUs. To address this limitation, this paper presents two parallel algorithms named DPCompact-SPADE (Depth load balancing Parallel Compact-SPADE) and APCompact-SPADE (Adaptive Parallel Compact-SPADE) for weighted clickstream pattern mining. Experiments on various datasets show that the proposed parallel algorithms are efficient and outperform state-of-the-art serial algorithms in terms of runtime, memory consumption, and scalability. © 2021 Elsevier Inc.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
S - Specific university research
Other
Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
INFORMATION SCIENCES
ISSN
0020-0255
e-ISSN
1872-6291
Volume of the periodical
582
Issue of the periodical within the volume
Not stated
Publisher country of the periodical
US - United States of America
Number of pages of the result
20
Pages from-to
349-368
UT WoS code of the article
000705073700006
EID of the result in the Scopus database
2-s2.0-85115427566