A novel firefly algorithm approach for efficient feature selection with COVID-19 dataset
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62690094%3A18470%2F23%3A50020594" target="_blank" >RIV/62690094:18470/23:50020594 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0141933123000248?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0141933123000248?via%3Dihub</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.micpro.2023.104778" target="_blank" >10.1016/j.micpro.2023.104778</a>
Alternative languages
Result language
English
Title in original language
A novel firefly algorithm approach for efficient feature selection with COVID-19 dataset
Result description in original language
Feature selection is one of the most important challenges in machine learning and data science. It is usually performed during data preprocessing, where the data are transformed into a format suitable for further processing by a machine learning algorithm. Many real-world datasets are high-dimensional and contain many irrelevant or even redundant features. Such features do not improve classification accuracy and can even degrade the performance of a classifier. The goal of feature selection is to find an optimal (or sub-optimal) subset of features that contains the relevant information about the dataset, from which machine learning algorithms can derive useful conclusions. In this manuscript, a novel version of the firefly algorithm (FA) is proposed and adapted for the feature selection challenge. The proposed method significantly improves on the performance of the basic FA and also outperforms other state-of-the-art metaheuristics on both benchmark bound-constrained and practical feature selection tasks. The method was first validated on standard unconstrained benchmarks and was later applied to feature selection using 21 standard University of California, Irvine (UCI) datasets. Moreover, the presented approach was also tested on a relatively novel COVID-19 dataset for predicting patients' health and on one microcontroller microarray dataset. The results obtained in all practical simulations attest to the robustness and efficiency of the proposed algorithm in terms of convergence, solution quality and classification accuracy. More precisely, the proposed approach obtained the best classification accuracy on 13 of the 21 datasets, significantly outperforming the competing methods.
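To make the idea in the description concrete, the following is a minimal sketch of the standard (basic) firefly algorithm applied to feature selection. It is not the novel variant proposed in the article, whose details are not given in this record. The function names, all parameter values, the 0.5 selection threshold and the toy fitness function are assumptions made for illustration; in a wrapper setting the fitness would typically combine a classifier's error with a penalty on the number of selected features.

```python
import math
import random

# Hypothetical ground truth for the toy problem: only features 0 and 2 matter.
RELEVANT = {0, 2}


def toy_fitness(mask):
    """Toy stand-in for 'classification error + subset-size penalty' (lower is better)."""
    selected = {i for i, keep in enumerate(mask) if keep}
    missed = len(RELEVANT - selected)   # relevant features left out
    extra = len(selected - RELEVANT)    # irrelevant features kept
    return missed + 0.1 * extra


def to_mask(position):
    """A feature counts as selected when its coordinate exceeds 0.5 (assumed threshold)."""
    return [x > 0.5 for x in position]


def firefly_feature_selection(fitness, n_features, n_fireflies=10, n_iter=30,
                              beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Basic firefly algorithm over positions in [0, 1]^n_features."""
    rng = random.Random(seed)
    swarm = [[rng.random() for _ in range(n_features)] for _ in range(n_fireflies)]
    cost = [fitness(to_mask(p)) for p in swarm]
    best = min(range(n_fireflies), key=cost.__getitem__)
    best_mask, best_cost = to_mask(swarm[best]), cost[best]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness decays with distance
                    swarm[i] = [
                        min(1.0, max(0.0, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    cost[i] = fitness(to_mask(swarm[i]))
                    if cost[i] < best_cost:
                        best_mask, best_cost = to_mask(swarm[i]), cost[i]
    return best_mask, best_cost


if __name__ == "__main__":
    mask, cost = firefly_feature_selection(toy_fitness, n_features=6)
    print("selected features:", [i for i, keep in enumerate(mask) if keep])
    print("fitness:", cost)
```

Encoding each firefly as a real-valued vector and thresholding it into a feature mask is one common way to reuse a continuous metaheuristic for a binary selection problem; other transfer functions are possible.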
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
MICROPROCESSORS AND MICROSYSTEMS
ISSN
0141-9331
e-ISSN
1872-9436
Volume of the periodical
98
Issue of the periodical within the volume
April
Country of the periodical's publisher
NL - Netherlands
Number of pages
21
Pages from-to
"Article Number: 104778"
UT WoS code of the article
000993462100001
EID of the result in the Scopus database
2-s2.0-85147606124