Incremental clickstream pattern mining with search boundaries

Result identifiers

  • Result code in IS VaVaI

    RIV/70883521:28140/24:63587598 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F24%3A63587598)

  • Result on the web

    https://www.sciencedirect.com/science/article/pii/S0020025524001701?via%3Dihub

  • DOI - Digital Object Identifier

    10.1016/j.ins.2024.120257 (http://dx.doi.org/10.1016/j.ins.2024.120257)

Alternative languages

  • Result language

    English

  • Title in original language

    Incremental clickstream pattern mining with search boundaries

  • Result description in original language

    Recently, there has been growing interest in sequential pattern mining within data mining, with a particular focus on clickstream pattern mining. These areas hold the potential for discovering valuable patterns. However, traditional mining algorithms in these domains often assume that databases are static, which simplifies the mining process. In reality, databases are updated incrementally over time, rendering a portion of the previous results invalid. This necessitates rerunning algorithms on the updated databases to obtain accurate frequent patterns. As database size increases, this approach can become time-consuming and degrade performance. To tackle this issue, we propose PSB-CUP to mine frequent clickstream patterns in an incremental update manner. PSB-CUP employs the concept of search borders to reduce the search space and the information retained in memory. Furthermore, an IDList generation method called "partial imbalance join" is proposed to reconstruct possibly missing information during the incremental process. This join method, however, requires extra information to be cached in exchange for speed. We then improve this technique by introducing "recursive imbalance join" in the PSB-CUP+ algorithm, which removes the need for extra cached data. The experimental results show that our proposed algorithms are efficient for incremental clickstream pattern mining.

Classification

  • Type

    Jimp - Article in a periodical indexed in the Web of Science database

  • CEP field

  • OECD FORD field

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

  • Project

  • Linkages

    S - Specific research at universities

Other

  • Year of implementation

    2024

  • Data confidentiality code

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Periodical name

    INFORMATION SCIENCES

  • ISSN

    0020-0255

  • e-ISSN

    1872-6291

  • Periodical volume

    662

  • Issue number within the volume

    Not specified

  • Publisher's country

    US - United States of America

  • Number of pages

    28

  • Pages from-to

    1-28

  • UT WoS code of the article

    001182274800001

  • EID of the result in the Scopus database

    2-s2.0-85185494013