Efficient algorithms for mining clickstream patterns using pseudo-IDLists
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F20%3A63524899" target="_blank" >RIV/70883521:28140/20:63524899 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0167739X19314475" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0167739X19314475</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.future.2020.01.034" target="_blank" >10.1016/j.future.2020.01.034</a>
Alternative languages
Result language
English
Original language name
Efficient algorithms for mining clickstream patterns using pseudo-IDLists
Original language description
Sequential pattern mining is an important task in data mining. Its subproblem, clickstream pattern mining, is starting to attract more research due to the growth of the Internet and the need to analyze online customer behavior. To date, only a few works have been dedicated to the problem of mining clickstream patterns. Although one approach is to use general algorithms for sequential pattern mining, those algorithms’ performance may suffer, and they need more resources than a dedicated method for mining clickstreams would. In this paper, we present pseudo-IDList, a novel data structure that is more suitable for clickstream pattern mining. Based on this structure, a vertical format algorithm named CUP (Clickstream pattern mining Using Pseudo-IDList) is proposed. Furthermore, we propose a pruning heuristic named DUB (Dynamic intersection Upper Bound) to improve our proposed algorithm. Four real-life clickstream databases are used for the experiments, and the results show that our proposed methods are effective and efficient in terms of runtime and memory consumption. © 2020 Elsevier B.V.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the realization of more than one project.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2020
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
FUTURE GENERATION COMPUTER SYSTEMS
ISSN
0167-739X
e-ISSN
—
Volume of the periodical
107
Issue of the periodical within the volume
Not stated
Country of publishing house
NL - THE KINGDOM OF THE NETHERLANDS
Number of pages
13
Pages from-to
18-30
UT code for WoS article
000527331800002
EID of the result in the Scopus database
2-s2.0-85078857727