A Hybrid Model of Cancer Diseases Diagnosis Based on Gene Expression Data with Joint Use of Data Mining Methods and Machine Learning Techniques
The result's identifiers
Result code in IS VaVaI
RIV/44555601:13440/23:43897684 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F44555601%3A13440%2F23%3A43897684)
Result on the web
https://www.scopus.com/record/display.uri?eid=2-s2.0-85160859759&origin=resultslist&sort=plf-f#metrics
DOI - Digital Object Identifier
10.3390/app13106022 (http://dx.doi.org/10.3390/app13106022)
Alternative languages
Result language
English
Original language name
A Hybrid Model of Cancer Diseases Diagnosis Based on Gene Expression Data with Joint Use of Data Mining Methods and Machine Learning Techniques
Original language description
One of the current focuses of modern bioinformatics is the development of hybrid models to process gene expression data, in order to create diagnostic systems for various diseases. In this study, we propose a solution to this problem that combines an inductive spectral clustering algorithm, random forest classifier, convolutional neural network, and alternative voting method for making the final decision about patient condition. In the first stage, we apply the spectral clustering algorithm to gene expression profiles using inductive methods of objective clustering, with the calculation of internal, external, and balance clustering quality criteria. This results in clusters of mutually correlated and differently expressed gene expression profiles. In the second stage, we apply the random forest classifier and convolutional neural network to identify the examined objects, containing as attributes the gene expression values in the allocated clusters. The presented research solves both binary- and multi-classification tasks. The final decision about the patient's condition is made using the alternative voting method, considering the classification results based on the gene expression data in various clusters. The simulation results showed that the proposed technique was highly effective, achieving a high accuracy in object identification when both classifiers were used. However, the convolutional neural network had a significantly higher data processing efficiency than the random forest algorithm, due to its substantially shorter processing time.
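The final decision stage described above, in which per-cluster classification results are combined into one verdict, could be sketched roughly as follows. The description does not define the alternative voting method, so this minimal sketch substitutes plain majority voting and should not be read as the authors' procedure; the function name `majority_vote` and the sample labels are hypothetical.

```python
from collections import Counter


def majority_vote(cluster_predictions):
    """Return the label predicted by most per-cluster classifiers.

    Assumption: plain majority voting stands in for the paper's
    "alternative voting method", which the description does not specify.
    Ties go to the tied label that appears first in the input.
    """
    if not cluster_predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(cluster_predictions)
    best = max(counts.values())
    # Walk the input in order so tie-breaking is deterministic.
    for label in cluster_predictions:
        if counts[label] == best:
            return label


# Hypothetical labels from classifiers trained on three gene clusters.
decision = majority_vote(["tumor", "normal", "tumor"])
print(decision)
```

Each entry in the input list stands for the output of one classifier trained on the gene expression values of one cluster, so the same function serves both the binary and the multi-class case.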
Czech name
—
Czech description
—
Classification
Type
Jimp - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2023
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Applied Sciences
ISSN
2076-3417
e-ISSN
2076-3417
Volume of the periodical
13
Issue of the periodical within the volume
10
Country of publishing house
CH - SWITZERLAND
Number of pages
19
Pages from-to
"not paginated"
UT code for WoS article
000995665800001
EID of the result in the Scopus database
—