Identification of Triple Negative Breast Cancer Genes Using Rough Set Based Feature Selection Algorithm & Ensemble Classifier

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F22%3A10250799" target="_blank" >RIV/61989100:27240/22:10250799 - isvavai.cz</a>
Result on the web
<a href="http://hcisj.com/data/file/article/2022110004/12-54.pdf" target="_blank" >http://hcisj.com/data/file/article/2022110004/12-54.pdf</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.22967/HCIS.2022.12.054" target="_blank" >10.22967/HCIS.2022.12.054</a>

Alternative languages

Result language
English
Title in the original language
Identification of Triple Negative Breast Cancer Genes Using Rough Set Based Feature Selection Algorithm & Ensemble Classifier
Result description in the original language
In recent decades, microarray datasets have played an important role in triple negative breast cancer (TNBC) detection. Classifying microarray data is challenging because the data contain numerous redundant and irrelevant features. Feature selection, which eliminates unneeded feature vectors from the system, is therefore indispensable in this research field. Selecting an optimal number of features significantly reduces the difficulty of this NP-hard problem, so this manuscript uses a rough set-based feature selection algorithm to select the optimal feature values. First, TNBC-related datasets are acquired from the Gene Expression Omnibus, such as GSE45827, GSE76275, GSE65194, GSE3744, GSE21653, and GSE7904. Next, the robust multi-array average technique is used to eliminate outlier TNBC/non-TNBC samples, which helps to enhance classification performance. The pre-processed microarray data are then passed to rough set theory for optimal gene selection, and the selected genes are given as inputs to an ensemble classification technique that separates low-risk genes (non-TNBC) from high-risk genes (TNBC). In the experimental evaluation, the ensemble-based rough set model obtained a mean accuracy of 97.24%, which is superior to the other machine learning techniques compared.
Title in English
Identification of Triple Negative Breast Cancer Genes Using Rough Set Based Feature Selection Algorithm & Ensemble Classifier
Result description in English
In recent decades, microarray datasets have played an important role in triple negative breast cancer (TNBC) detection. Classifying microarray data is challenging because the data contain numerous redundant and irrelevant features. Feature selection, which eliminates unneeded feature vectors from the system, is therefore indispensable in this research field. Selecting an optimal number of features significantly reduces the difficulty of this NP-hard problem, so this manuscript uses a rough set-based feature selection algorithm to select the optimal feature values. First, TNBC-related datasets are acquired from the Gene Expression Omnibus, such as GSE45827, GSE76275, GSE65194, GSE3744, GSE21653, and GSE7904. Next, the robust multi-array average technique is used to eliminate outlier TNBC/non-TNBC samples, which helps to enhance classification performance. The pre-processed microarray data are then passed to rough set theory for optimal gene selection, and the selected genes are given as inputs to an ensemble classification technique that separates low-risk genes (non-TNBC) from high-risk genes (TNBC). In the experimental evaluation, the ensemble-based rough set model obtained a mean accuracy of 97.24%, which is superior to the other machine learning techniques compared.
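
The rough set-based gene selection step described above can be illustrated with a minimal sketch. The record does not state which rough set variant the authors used, so the greedy QuickReduct-style search below is an assumption, as are the binary discretization of expression levels (0 = low, 1 = high), the toy data, and the function names `dependency` and `quick_reduct`. It is meant only to show how a dependency degree can rank candidate genes, not to reproduce the published method.

```python
from collections import defaultdict


def dependency(rows, labels, attrs):
    """Rough-set dependency degree of the decision on `attrs`.

    Samples with identical values on `attrs` form one equivalence class.
    The degree is the share of samples that fall in classes carrying a
    single decision label (the positive region).
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[tuple(row[a] for a in attrs)].append(label)
    consistent = sum(len(g) for g in groups.values() if len(set(g)) == 1)
    return consistent / len(rows)


def quick_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until the subset is as informative as the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    selected = []
    best = dependency(rows, labels, selected)
    while best < target:
        candidates = [a for a in all_attrs if a not in selected]
        scores = {a: dependency(rows, labels, selected + [a]) for a in candidates}
        choice = max(candidates, key=lambda a: scores[a])
        selected.append(choice)
        best = scores[choice]
    return selected


# Hypothetical toy data: 6 samples, 4 discretized genes (columns 0..3).
# Labels: 1 = TNBC, 0 = non-TNBC.
rows = [
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
labels = [1, 1, 0, 0, 1, 0]

print(quick_reduct(rows, labels))  # prints [0]
```

In this toy table, gene 0 alone separates the two classes, so the search stops after one step. The selected columns would then be passed to whatever ensemble classifier is used downstream.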

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
20200 - Electrical engineering, Electronic engineering, Information engineering

Result linkages

Project
—
Linkages
S - Specific research at universities

Other

Year of implementation
2022
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Human-centric Computing and Information Sciences
ISSN
2192-1962
e-ISSN
—
Periodical volume
12
Issue number within the volume
54
Publisher's country
KR - Republic of Korea
Number of pages
15
Pages from-to
unpaginated
UT WoS article code
000890282100001
Scopus EID of the result
—
