Metaheuristic-driven space partitioning and ensemble learning for imbalanced classification
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F24%3A10257043" target="_blank" >RIV/61989100:27240/24:10257043 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S1568494624010524?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S1568494624010524?via%3Dihub</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.asoc.2024.112278" target="_blank" >10.1016/j.asoc.2024.112278</a>
Alternative languages
Result language
English
Title in original language
Metaheuristic-driven space partitioning and ensemble learning for imbalanced classification
Result description in original language
Imbalanced classification is a common problem in machine learning, particularly when misclassifying minority-class instances carries a significant cost. The literature addresses it with data-level, algorithm-level, cost-sensitive, and hybrid approaches. This paper introduces a novel method that improves the ability of classification models to identify patterns and handles class imbalance while minimizing alterations to the original data distribution. The proposed framework combines ensemble learning, space partitioning, and the Synthetic Minority Oversampling Technique (SMOTE): it decomposes the space into balanced subspaces and then trains an ensemble classifier on these subspaces using a bagging approach. In the first step, we develop a Space Partitioning by Metaheuristic algorithm (SPMH) to divide the space into multiple balanced subspaces. In the second step, we present Imbalanced Classification by SPMH (ICSPMH) as a solution to imbalanced class problems. ICSPMH runs SPMH multiple times, obtaining a different set of subspaces each time. It then trains a separate classifier for each portion of the space and combines them into an ensemble classifier through a bagging technique. To assess the framework, we selected 44 well-known datasets for comparison with several state-of-the-art approaches. The results show that ICSPMH outperforms the competing methods and can potentially reduce the oversampling rate to zero. Additionally, an experiment indicated that the choice of metaheuristic algorithm in SPMH does not significantly affect the final performance.
The paper also includes a correlation analysis between oversampling rate and final performance, revealing that the framework effectively eliminates imbalanced-data problems with minimal changes to the original dataset. In summary, because ICSPMH changes the data distribution less and sets up local classifiers that improve classification performance, it is a promising method for classifying imbalanced datasets.
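The two-step procedure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: a plain random search stands in for the metaheuristic in SPMH, a nearest-class-mean rule stands in for the local classifiers, a majority vote over repeated partitions stands in for the bagging aggregation, and the SMOTE step is omitted. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def nearest(centers, x):
    """Index of the center closest to x (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(centers[i], x)))

def imbalance(centers, X, y):
    """Mean per-region imbalance |n_pos - n_neg| / n; 0 means perfectly balanced."""
    counts = [[0, 0] for _ in centers]          # [n_neg, n_pos] per region
    for x, label in zip(X, y):
        counts[nearest(centers, x)][label] += 1
    scores = [abs(p - n) / (p + n) for n, p in counts if p + n > 0]
    return sum(scores) / len(scores)

def partition(X, y, k, rng, trials=200):
    """Assumed stand-in for SPMH: random search for the k centers whose
    nearest-center regions are the most balanced."""
    best, best_score = None, float("inf")
    for _ in range(trials):
        centers = rng.sample(X, k)
        score = imbalance(centers, X, y)
        if score < best_score:
            best, best_score = centers, score
    return best

def fit_local(X, y):
    """Nearest-class-mean model for one region: label -> mean vector."""
    means = {}
    for label in set(y):
        pts = [x for x, l in zip(X, y) if l == label]
        means[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return means

def predict_local(means, x):
    return min(means,
               key=lambda l: sum((a - b) ** 2 for a, b in zip(means[l], x)))

def fit_ensemble(X, y, k=2, rounds=5, seed=0):
    """Repeat the partitioning and train one local model per region each time."""
    rng = random.Random(seed)
    members = []
    for _ in range(rounds):
        centers = partition(X, y, k, rng)
        regions = [([], []) for _ in centers]
        for x, label in zip(X, y):
            rx, ry = regions[nearest(centers, x)]
            rx.append(x)
            ry.append(label)
        members.append((centers, [fit_local(rx, ry) for rx, ry in regions]))
    return members

def predict(members, x):
    """Majority vote over the local models responsible for x."""
    votes = Counter(predict_local(models[nearest(centers, x)], x)
                    for centers, models in members)
    return votes.most_common(1)[0][0]

# Toy imbalanced data: 20 majority points (label 0), 4 minority points (label 1).
X = [(i * 0.5, j * 0.5) for i in range(5) for j in range(4)]
X += [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (5.5, 5.5)]
y = [0] * 20 + [1] * 4

members = fit_ensemble(X, y)
print(predict(members, (5.2, 5.2)), predict(members, (0.5, 0.5)))  # prints: 1 0
```

The fitness used here rewards partitions whose regions each hold similar numbers of both classes, which is one plausible reading of "balanced subspaces"; the paper's actual objective and classifier choices may differ.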
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
O - Operational programme project
Others
Publication year
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Applied Soft Computing
ISSN
1568-4946
e-ISSN
1872-9681
Volume of the periodical
167
Issue of the periodical within the volume
Dec
Country of the periodical's publisher
US - United States of America
Number of pages of the result
22
Pages from-to
—
UT WoS code of the article
001338601900001
EID of the result in the Scopus database
—