An efficient method for mining frequent sequential patterns using multi-Core processors
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F17%3A10238742" target="_blank" >RIV/61989100:27240/17:10238742 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/article/10.1007%2Fs10489-016-0859-y" target="_blank" >https://link.springer.com/article/10.1007%2Fs10489-016-0859-y</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10489-016-0859-y" target="_blank" >10.1007/s10489-016-0859-y</a>
Alternative languages
Result language
English
Title in the original language
An efficient method for mining frequent sequential patterns using multi-Core processors
Result description in the original language
The problem of mining frequent sequential patterns (FSPs) has attracted a great deal of research attention. Although there are many efficient algorithms for mining FSPs, the mining time is still high, especially for large or dense datasets. Parallel processing has been widely applied to improve processing speed for various problems. Some parallel algorithms have been proposed, but most of them have problems related to synchronization and load balancing. Based on a multi-core processor architecture, this paper proposes a load-balancing parallel approach called Parallel Dynamic Bit Vector Sequential Pattern Mining (pDBV-SPM) for mining FSPs from huge datasets, using the dynamic bit vector data structure to determine support values quickly. In the pDBV-SPM approach, the frequent 1-sequences are sorted in ascending order of support count before being partitioned into parts, each of which is assigned to a task on a processor, so that most of the nodes in the leftmost branches will be infrequent and thus pruned during the search; this strategy helps to better balance the search tree. Experiments are conducted to verify the effectiveness of pDBV-SPM. The experimental results show that the proposed algorithm outperforms PIB-PRISM for mining FSPs in terms of mining time and memory usage.
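The ordering and partitioning step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: plain Python integers stand in for the dynamic bit vector, the round-robin split is an assumed partitioning scheme (the abstract does not say how the parts are formed), and the names `item_bitsets`, `support`, `frequent_items_ascending`, and `partition` are hypothetical.

```python
from collections import defaultdict


def item_bitsets(database):
    """Map each item to an int whose bit i is set when sequence i contains it.

    A plain int is used as a stand-in for the paper's dynamic bit vector.
    """
    bitsets = defaultdict(int)
    for sid, sequence in enumerate(database):
        for itemset in sequence:
            for item in itemset:
                bitsets[item] |= 1 << sid
    return dict(bitsets)


def support(bits):
    """Support of a 1-sequence = number of set bits (sequences containing it)."""
    return bin(bits).count("1")


def frequent_items_ascending(database, min_support):
    """Return (item, support) pairs meeting min_support, lowest support first."""
    pairs = [(item, support(bits)) for item, bits in item_bitsets(database).items()]
    frequent = [pair for pair in pairs if pair[1] >= min_support]
    # Ties are broken by item name so the order is deterministic.
    return sorted(frequent, key=lambda pair: (pair[1], pair[0]))


def partition(items, num_tasks):
    """Split the ordered items round-robin (an assumption, not the paper's scheme)."""
    return [items[i::num_tasks] for i in range(num_tasks)]


# Toy sequence database: each sequence is a list of itemsets.
database = [
    [{"a"}, {"b", "c"}, {"a"}],
    [{"a", "b"}, {"c"}],
    [{"b"}, {"d"}],
    [{"a"}, {"b"}],
]

ordered = frequent_items_ascending(database, min_support=2)
parts = partition(ordered, num_tasks=2)
```

Each element of `parts` could then be handed to a separate worker that extends its 1-sequences into longer patterns; that search itself is outside the scope of this sketch.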
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific university research
Others
Publication year
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Applied Intelligence
ISSN
0924-669X
e-ISSN
—
Volume of the periodical
46
Issue of the periodical within the volume
3
Country of the publisher
NL - Netherlands
Number of pages
14
Pages from-to
703-716
UT WoS code of the article
000398110300014
EID of the result in the Scopus database
2-s2.0-84994378107