Applying the Deep Learning Techniques to Solve Classification Tasks Using Gene Expression Data
The result's identifiers
Result code in IS VaVaI
RIV/44555601:13440/24:43898369 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F44555601%3A13440%2F24%3A43898369)
Result on the web
https://ieeexplore.ieee.org/document/10440636
DOI - Digital Object Identifier
10.1109/ACCESS.2024.3368070 (http://dx.doi.org/10.1109/ACCESS.2024.3368070)
Alternative languages
Result language
English
Original language name
Applying the Deep Learning Techniques to Solve Classification Tasks Using Gene Expression Data
Original language description
This manuscript explores the application of deep learning (DL) techniques for classifying gene expression data. A key aspect of our research is the comparative analysis of various DL neural network architectures, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) Recurrent Neural Networks (RNN), as well as hybrid models that combine these networks. We applied the Bayesian optimization algorithm with 5-fold cross-validation for hyperparameter tuning, which is crucial for DL algorithm performance. Significantly, we have advanced the methods for applying RNNs to gene expression data, focusing particularly on the LSTM and GRU types. Our study also introduces a novel hybrid quality criterion for data classification, calculated as a weighted sum of partial quality criteria and incorporating an integrated F1-score derived through the Harrington desirability method. Furthermore, we investigate hybrid models that leverage various DL methods, enhancing decision-making objectivity in sample identification. These hybrid models use a step-by-step information processing procedure, first applying different DL models to the gene expression data and then processing their outputs through a CART-based classifier for final decision-making. Our experiments, performed on gene expression data from patients with eight cancer types and one subset of normal samples (without cancer), demonstrated that GRU-RNN-based models, particularly a two-layer GRU-RNN, achieved the highest classification efficacy, with an accuracy of 97.8% on the test dataset. The performance of this model exceeded that of the other models, whose accuracy varied between 96.6% and 97.3%. Comparative analysis with other studies in this field suggests that the proposed techniques are more effective than those reported in similar research on the application of DL models to cancer-type diagnosis.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
IEEE Access
ISSN
2169-3536
e-ISSN
—
Volume of the periodical
2024
Issue of the periodical within the volume
12
Country of publishing house
US - UNITED STATES
Number of pages
12
Pages from-to
28437-28448
UT code for WoS article
001174249000001
EID of the result in the Scopus database
2-s2.0-85186090110