A hybrid supervised machine learning classifier system for breast cancer prognosis using feature selection and data imbalance handling approaches
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F21%3A10247374" target="_blank" >RIV/61989100:27240/21:10247374 - isvavai.cz</a>
Result on the web
<a href="https://www.mdpi.com/2079-9292/10/6/699" target="_blank" >https://www.mdpi.com/2079-9292/10/6/699</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/electronics10060699" target="_blank" >10.3390/electronics10060699</a>
Alternative languages
Result language
English
Original language name
A hybrid supervised machine learning classifier system for breast cancer prognosis using feature selection and data imbalance handling approaches
Original language description
Breast cancer is currently the most frequent cancer among women. Early detection is critical and can be achieved effectively with machine learning (ML) techniques. This article therefore investigates methods for improving the accuracy of ML classification models for breast cancer prognosis. A wrapper-based feature selection approach, along with nature-inspired algorithms such as Particle Swarm Optimization, Genetic Search, and Greedy Stepwise, was used to identify the important features. On these selected features, the popular machine learning classifiers Support Vector Machine, J48 (the C4.5 decision tree algorithm), and Multilayer Perceptron (a feed-forward ANN) were applied. The methodology of the proposed system is structured into five stages: (1) data pre-processing; (2) data imbalance handling; (3) feature selection; (4) machine learning classifiers; and (5) classifier performance evaluation. The dataset used in the experiments is the Breast Cancer Wisconsin (Diagnostic) Data Set from the UCI Machine Learning Repository. The results indicate that the J48 decision tree classifier is the most suitable of the evaluated classifiers for breast cancer prognosis. Support Vector Machine with the Particle Swarm Optimization algorithm for feature selection achieves an accuracy of 98.24%, MCC = 0.961, Sensitivity = 99.11%, Specificity = 96.54%, and a Kappa statistic of 0.9606. The J48 Decision Tree classifier with the Genetic Search algorithm for feature selection achieves an accuracy of 98.83%, MCC = 0.974, Sensitivity = 98.95%, Specificity = 98.58%, and a Kappa statistic of 0.9735. Furthermore, the Multilayer Perceptron ANN classifier with the Genetic Search algorithm for feature selection achieves an accuracy of 98.59%, MCC = 0.968, Sensitivity = 98.6%, Specificity = 98.57%, and a Kappa statistic of 0.9682. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
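The description reports accuracy, MCC, sensitivity, specificity, and the Kappa statistic for each classifier. As a minimal sketch of how these five measures relate to a binary confusion matrix, the Python function below computes them from raw counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the article; which class is treated as positive is likewise an assumption.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn/fp/tn are true positives, false negatives, false positives and
    true negatives. Which class counts as "positive" is an assumption here.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total

    # Sensitivity (recall on positives) and specificity (recall on negatives).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((tn + fp) / total) * ((tn + fn) / total)
    p_e = p_pos + p_neg
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not results from the article).
example = binary_metrics(tp=90, fn=10, fp=5, tn=95)
```

With balanced classes, as in the hypothetical counts above, MCC and Kappa come out close to each other, which matches the pattern in the reported figures (for example MCC = 0.974 alongside Kappa = 0.9735).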
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
20201 - Electrical and electronic engineering
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Electronics
ISSN
2079-9292
e-ISSN
—
Volume of the periodical
10
Issue of the periodical within the volume
6
Country of publishing house
CH - SWITZERLAND
Number of pages
16
Pages from-to
1-16
UT code for WoS article
000634365700001
EID of the result in the Scopus database
2-s2.0-85102448252