Enhancing Cardiovascular Risk Assessment with Advanced Data Balancing and Domain Knowledge-driven Explainability

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216275:25410/24:39922247 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216275%3A25410%2F24%3A39922247)

  • Result on the web

    https://www.sciencedirect.com/science/article/pii/S0957417424017536

  • DOI - Digital Object Identifier

    10.1016/j.eswa.2024.124886 (http://dx.doi.org/10.1016/j.eswa.2024.124886)

Alternative languages

  • Result language

    English

  • Original language name

    Enhancing Cardiovascular Risk Assessment with Advanced Data Balancing and Domain Knowledge-driven Explainability

  • Original language description

    In medical risk prediction, such as predicting heart disease, machine learning (ML) classifiers must achieve high accuracy, precision, and recall to minimize the chances of incorrect diagnoses or treatment recommendations. However, real-world datasets often have imbalanced data, which can affect classifier performance. Traditional data balancing methods can lead to overfitting and underfitting, making it difficult to identify potential health risks accurately. Early prediction of heart attacks is of paramount importance, and researchers have developed ML-based systems to address this problem. However, much of the existing ML research is based on a single dataset, often ignoring performance evaluation across multiple datasets. As the demand for interpretable ML models grows, model interpretability becomes central to revealing insights and feature effects within predictive models. To address these challenges, we present a novel data balancing technique that uses a divide-and-conquer strategy with the K-Means clustering algorithm to segment the dataset. The performance of our approach is highlighted through comparisons with established techniques, which demonstrate the superiority of our proposed method. To address the challenge of inter-dataset discrepancies, we use two different datasets. Our holistic pipeline, strengthened by the innovative balancing technique, effectively addresses performance discrepancies, culminating in a significant improvement from 81% to 90%. Furthermore, through advanced statistical analysis, it has been determined that the 95% confidence interval for the AUC metric of our method ranges from 0.8187 to 0.8411. This observation serves to underscore the consistency and reliability of our approach, demonstrating its ability to achieve high performance across a range of scenarios. Incorporating Explainable AI (XAI), we examine the feature rankings and their contributions within the best performing Random Forest model. While the domain expert feedback is consistent with the explanatory power of XAI, some differences remain. Nevertheless, a remarkable convergence in feature ranking and weighting is observed, bridging the insights from XAI tools and domain expert perspectives.
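
The description above mentions a divide-and-conquer balancing step that segments the dataset with K-Means, but it does not say how each segment is then balanced. The following is only a minimal sketch of one possible reading, not the authors' method: it assumes that the minority class is randomly oversampled inside each segment. The function names `kmeans` and `balance_by_cluster`, the farthest-first initialization, and the toy data are all hypothetical choices made for illustration.

```python
import random
from collections import Counter, defaultdict


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point.

    Assumption: deterministic farthest-first initialization, chosen here
    only to keep the sketch reproducible.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def balance_by_cluster(X, y, k=2, seed=0):
    """Segment with k-means, then oversample minorities inside each segment.

    Assumption: random duplication is the per-segment balancing rule;
    the source does not specify this step.
    """
    rng = random.Random(seed)
    segments = defaultdict(list)
    for xi, yi, c in zip(X, y, kmeans(X, k)):
        segments[c].append((xi, yi))
    X_out, y_out = [], []
    for rows in segments.values():
        counts = Counter(label for _, label in rows)
        target = max(counts.values())
        for label, n in counts.items():
            pool = [xi for xi, yl in rows if yl == label]
            # Keep every original sample, then add random duplicates
            # until this class matches the segment's largest class.
            picked = pool + [rng.choice(pool) for _ in range(target - n)]
            X_out.extend(picked)
            y_out.extend([label] * len(picked))
    return X_out, y_out


# Toy data (hypothetical): two well-separated blobs, both dominated by class 0.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4), (0.4, 0.1),
     (0.2, 0.3), (0.3, 0.2),
     (10.0, 10.0), (10.1, 10.2), (10.2, 10.1), (10.3, 10.3),
     (10.2, 10.3)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1]

X_bal, y_bal = balance_by_cluster(X, y, k=2)
print(Counter(y), Counter(y_bal))
```

On this toy data the class counts go from 10 versus 3 to 10 versus 10, because each segment is balanced on its own. A segment that contains only one class is left unchanged in this sketch, so the overall result is not guaranteed to be globally balanced.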

  • Czech name

  • Czech description

Classification

  • Type

    Jimp - Article in a specialist periodical, which is included in the Web of Science database

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

    I - Institutional support for the long-term conceptual development of a research organisation

Others

  • Publication year

    2024

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    Expert Systems with Applications

  • ISSN

    0957-4174

  • e-ISSN

    1873-6793

  • Volume of the periodical

    255

  • Issue of the periodical within the volume

    December

  • Country of publishing house

    GB - UNITED KINGDOM

  • Number of pages

    20

  • Pages from-to

    124886

  • UT code for WoS article

    001286672200001

  • EID of the result in the Scopus database

    2-s2.0-85200145279