Incremental clickstream pattern mining with search boundaries

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F24%3A63587598" target="_blank" >RIV/70883521:28140/24:63587598 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0020025524001701?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0020025524001701?via%3Dihub</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.ins.2024.120257" target="_blank" >10.1016/j.ins.2024.120257</a>

Alternative languages

Result language
English
Title in original language
Incremental clickstream pattern mining with search boundaries
Result description in original language
Interest in sequential pattern mining, and in clickstream pattern mining in particular, has grown in recent years. Both areas hold the potential for discovering valuable patterns. However, traditional mining algorithms in these domains often assume that databases are static, which simplifies the mining process. In reality, databases are updated incrementally over time, rendering part of the previous results invalid. Obtaining accurate frequent patterns then requires rerunning the algorithms on the updated databases. As database size increases, this approach becomes time-consuming and degrades performance. To tackle this issue, we propose PSB-CUP, which mines frequent clickstream patterns in an incremental manner. PSB-CUP employs the concept of search borders to reduce both the search space and the information retained in memory. Furthermore, we propose an IDList generation method called "partial imbalance join" to reconstruct information that may be missing during the incremental process. This join method, however, requires additional information to be cached in exchange for speed. We then improve this technique by introducing "recursive imbalance join", which removes the need for the extra cached data, in the PSB-CUP+ algorithm. The experimental results show that the proposed algorithms are efficient for incremental clickstream pattern mining.

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
S - Specific research at universities

Others

Publication year
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
INFORMATION SCIENCES
ISSN
0020-0255
e-ISSN
1872-6291
Periodical volume
662
Issue number within the volume
Not specified
Publisher's country
US - United States of America
Number of pages
28
Pages from-to
1-28
UT WoS code of the article
001182274800001
EID of the result in the Scopus database
2-s2.0-85185494013

Similar results (10)

An approach for incremental mining of clickstream patterns as a service application
Efficient algorithms for mining clickstream patterns using pseudo-IDLists
An efficient method for mining sequential patterns with indices
