Incremental clickstream pattern mining with search boundaries
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F24%3A63587598" target="_blank" >RIV/70883521:28140/24:63587598 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0020025524001701?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0020025524001701?via%3Dihub</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.ins.2024.120257" target="_blank" >10.1016/j.ins.2024.120257</a>
Alternative languages
Result language
English
Original language name
Incremental clickstream pattern mining with search boundaries
Original language description
Sequential pattern mining, and clickstream pattern mining in particular, has recently attracted growing interest in data mining because of its potential for discovering valuable patterns. However, traditional mining algorithms in these domains often assume that databases are static, which simplifies the mining process. In reality, databases are updated incrementally over time, which invalidates a portion of the previous results. Obtaining accurate frequent patterns then requires rerunning the algorithms on the updated databases. As database size increases, this approach can become time-consuming and degrade performance. To tackle this issue, we propose PSB-CUP to mine frequent clickstream patterns in an incremental update manner. PSB-CUP employs the concept of search borders to reduce the search space and the information retained in memory. Furthermore, an IDList generation method called "partial imbalance join" is proposed to reconstruct possibly missing information during the incremental process. This join method, however, requires extra information to be cached in exchange for speed. We then improve this technique by introducing "recursive imbalance join", which removes the need for extra cached data, in the PSB-CUP+ algorithm. The experimental results show that our proposed algorithms are efficient for incremental clickstream pattern mining.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific research at higher education institutions
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
INFORMATION SCIENCES
ISSN
0020-0255
e-ISSN
1872-6291
Volume of the periodical
662
Issue of the periodical within the volume
Not stated
Country of publishing house
US - UNITED STATES
Number of pages
28
Pages from-to
1-28
UT code for WoS article
001182274800001
EID of the result in the Scopus database
2-s2.0-85185494013