Principal balances of compositional data for regression and classification using partial least squares
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F23%3A73622796" target="_blank" >RIV/61989592:15310/23:73622796 - isvavai.cz</a>
Alternative codes found
RIV/61989592:15110/23:73622796 RIV/62690094:18450/23:50020789 RIV/00098892:_____/23:10158301
Result on the web
<a href="https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3518" target="_blank" >https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3518</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1002/cem.3518" target="_blank" >10.1002/cem.3518</a>
Alternative languages
Result language
English
Title in original language
Principal balances of compositional data for regression and classification using partial least squares
Result description in original language
High-dimensional compositional data are commonplace in the modern omics sciences, among others. Analysis of compositional data requires the proper choice of a log-ratio coordinate representation, since their relative nature is not compatible with the direct use of standard statistical methods. Principal balances, a particular class of orthonormal log-ratio coordinates, are well suited to this context as they are constructed so that the first few coordinates capture most of the compositional variability of the data set. Focusing on regression and classification problems in high dimensions, we propose a novel partial least squares (PLS) procedure to construct principal balances that maximize the explained variability of the response variable and notably ease interpretability when compared to the ordinary PLS formulation. The proposed PLS principal balance approach can be understood as a generalized version of common log-contrast models since, instead of just one, multiple orthonormal log-contrasts are estimated simultaneously. We demonstrate the performance of the proposed method using both simulated and empirical data sets.
Title in English
Principal balances of compositional data for regression and classification using partial least squares
Result description in English
High-dimensional compositional data are commonplace in the modern omics sciences, among others. Analysis of compositional data requires the proper choice of a log-ratio coordinate representation, since their relative nature is not compatible with the direct use of standard statistical methods. Principal balances, a particular class of orthonormal log-ratio coordinates, are well suited to this context as they are constructed so that the first few coordinates capture most of the compositional variability of the data set. Focusing on regression and classification problems in high dimensions, we propose a novel partial least squares (PLS) procedure to construct principal balances that maximize the explained variability of the response variable and notably ease interpretability when compared to the ordinary PLS formulation. The proposed PLS principal balance approach can be understood as a generalized version of common log-contrast models since, instead of just one, multiple orthonormal log-contrasts are estimated simultaneously. We demonstrate the performance of the proposed method using both simulated and empirical data sets.
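Illustrative note: the description refers to balances as orthonormal log-ratio coordinates and to log-contrasts. The sketch below is not the PLS procedure of the article. It only shows, under the standard textbook definition, how a single balance between two groups of parts can be computed and rewritten as a log-contrast whose coefficients sum to zero and have unit norm. The function names `balance` and `balance_log_contrast`, the example composition, and the chosen grouping of parts are assumptions made for illustration.

```python
import numpy as np


def balance(x, num_idx, den_idx):
    """Balance between two groups of parts (hypothetical helper).

    Computes sqrt(r*s/(r+s)) * ln(g(x_num) / g(x_den)), where g is the
    geometric mean and r, s are the group sizes.
    """
    x = np.asarray(x, dtype=float)
    r, s = len(num_idx), len(den_idx)
    log_x = np.log(x)
    # The difference of mean logs equals the log-ratio of geometric means.
    log_ratio = log_x[..., num_idx].mean(axis=-1) - log_x[..., den_idx].mean(axis=-1)
    return np.sqrt(r * s / (r + s)) * log_ratio


def balance_log_contrast(n_parts, num_idx, den_idx):
    """Coefficient vector of the same balance written as a log-contrast."""
    r, s = len(num_idx), len(den_idx)
    a = np.zeros(n_parts)
    a[num_idx] = np.sqrt(s / (r * (r + s)))   # positive weights, numerator group
    a[den_idx] = -np.sqrt(r / (s * (r + s)))  # negative weights, denominator group
    return a


# Assumed example: a 4-part composition, parts {0, 1} against parts {2, 3}.
composition = np.array([0.1, 0.2, 0.3, 0.4])
b = balance(composition, [0, 1], [2, 3])
a = balance_log_contrast(4, [0, 1], [2, 3])
```

Because the coefficients in `a` sum to zero, rescaling the composition (for example, switching from proportions to percentages) leaves the balance unchanged, which is the sense in which such coordinates respect the relative nature of compositional data.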
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10103 - Statistics and probability
Result linkages
Project
The result was created during the implementation of multiple projects. More information is available in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of publication
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
JOURNAL OF CHEMOMETRICS
ISSN
0886-9383
e-ISSN
1099-128X
Volume of the periodical
37
Issue of the periodical within the volume
12
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages of the result
22
Pages from-to
"e3518-1"-"e3518-22"
UT WoS code of the article
001114643400005
EID of the result in the Scopus database
2-s2.0-85171649327