Weighting of parts in compositional data analysis: Advances and applications
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F22%3A73615127" target="_blank" >RIV/61989592:15310/22:73615127 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/article/10.1007/s11004-021-09952-y" target="_blank" >https://link.springer.com/article/10.1007/s11004-021-09952-y</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s11004-021-09952-y" target="_blank" >10.1007/s11004-021-09952-y</a>
Alternative languages
Result language
English
Title in original language
Weighting of parts in compositional data analysis: Advances and applications
Result description in original language
It often occurs in practice that it is sensible to give different weights to the variables involved in a multivariate data analysis, and the same holds for compositional data as multivariate observations carrying relative information. It can be convenient to apply weights to better accommodate differences in the quality of the measurements, the occurrence of zeros and missing values, or generally to highlight some specific features of compositional parts. The characterisation of compositional data as elements of a Bayes space, which is a natural generalisation of the ordinary Aitchison geometry, enables the definition of a formal framework to implement weighting schemes for the parts of a composition. This is formally achieved by considering a reference measure in the Bayes space alternative to the common uniform measure via the well-known chain rule. Unweighted centred logratio (clr) coefficients and isometric logratio (ilr) coordinates then allow compositions to be expressed in the real space equipped with the (unweighted) Euclidean geometry. The resulting elements of the real space generated by the clr coefficients or ilr coordinates are invariant to the scale of the original compositions, but the actual scale of the weights matters. In this work, these formal developments are presented and used to introduce a general approach for weighting parts in compositional data analysis. The practical use is demonstrated on simulated and real-world data sets in the context of the earth sciences.
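As a rough illustration of the scale invariance mentioned in the description, the following minimal sketch computes the ordinary (unweighted) clr coefficients, that is, the log of each part minus the mean of the logs, for a composition and for a rescaled copy of it. It is not the weighted construction from the article, which the record does not spell out; the function name `clr` and the example values are assumptions made purely for illustration.

```python
import numpy as np


def clr(x):
    """Ordinary (unweighted) centred logratio coefficients.

    Illustrative helper, not taken from the article: log of each part
    minus the arithmetic mean of the logs of all parts.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("clr requires strictly positive parts")
    log_x = np.log(x)
    return log_x - log_x.mean()


# Hypothetical three-part composition, expressed in proportions ...
composition = np.array([0.2, 0.3, 0.5])
# ... and the same composition expressed in percent.
rescaled = 100.0 * composition

clr_a = clr(composition)
clr_b = clr(rescaled)

# Rescaling all parts by a common factor leaves the clr coefficients unchanged.
print(np.allclose(clr_a, clr_b))
```

Multiplying every part by the same constant adds the same term to each log and to their mean, so it cancels in the difference. The clr coefficients also sum to zero by construction.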
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10103 - Statistics and probability
Result linkages
Project
<a href="/cs/project/GA19-01768S" target="_blank" >GA19-01768S: Separation of geochemical signals in sediments: application of advanced statistical methods to large geochemical data sets</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Mathematical Geosciences
ISSN
1874-8961
e-ISSN
1874-8953
Volume of the periodical
54
Issue of the periodical within the volume
1
Country of the periodical's publisher
DE - Federal Republic of Germany
Number of pages of the result
23
Pages from-to
71-93
UT WoS code of the article
000669833300001
EID of the result in the Scopus database
2-s2.0-85109318084