A Sparse Pair-preserving Centroid-based Supervised Learning Method for High-dimensional Biomedical Data or Images
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F67985807%3A_____%2F20%3A00524330" target="_blank" >RIV/67985807:_____/20:00524330 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1016/j.bbe.2020.03.008" target="_blank" >http://dx.doi.org/10.1016/j.bbe.2020.03.008</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.bbe.2020.03.008" target="_blank" >10.1016/j.bbe.2020.03.008</a>
Alternative languages
Result language
English
Title in the original language
A Sparse Pair-preserving Centroid-based Supervised Learning Method for High-dimensional Biomedical Data or Images
Result description in the original language
In various biomedical applications designed to compare two groups (e.g. patients and controls in matched case-control studies), it is often desirable to perform a dimensionality reduction in order to learn a classification rule over high-dimensional data. This paper considers a centroid-based classification method for paired data, which at the same time performs a supervised variable selection respecting the matched pairs design. We propose an algorithm for optimizing the centroid (prototype, template). A subsequent optimization of weights for the centroid ensures sparsity, robustness to outliers, and clear interpretation of the contribution of individual variables to the classification task. We apply the method to a simulated matched case-control study dataset, to a gene expression study of acute myocardial infarction, and to mouth localization in 2D facial images. The novel approach yields a comparable performance with standard classifiers and outperforms them if the data are contaminated by outliers. This robustness makes the method relevant for genomic, metabolomic or proteomic high-dimensional data (in matched case-control studies) or medical diagnostics based on images, as (excessive) noise and contamination are ubiquitous in biomedical measurements.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/cs/project/GA19-05704S" target="_blank" >GA19-05704S: FoNeCo: Analytical Foundations of Neurocomputing</a>
Continuities
I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2020
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Biocybernetics and Biomedical Engineering
ISSN
0208-5216
e-ISSN
—
Volume of the periodical
40
Issue of the periodical within the volume
2
Country of the periodical's publisher
PL - Poland
Number of pages
13
Pages from-to
774-786
UT WoS code of the article
000547542400014
EID of the result in the Scopus database
2-s2.0-85084491501