Principal Component Analysis for Distributions Observed by Samples in Bayes Spaces
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F24%3A73627773" target="_blank" >RIV/61989592:15310/24:73627773 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/article/10.1007/s11004-024-10142-9" target="_blank" >https://link.springer.com/article/10.1007/s11004-024-10142-9</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s11004-024-10142-9" target="_blank" >10.1007/s11004-024-10142-9</a>
Alternative languages
Result language
English
Title in original language
Principal Component Analysis for Distributions Observed by Samples in Bayes Spaces
Result description in original language
Distributional data have recently become increasingly important for understanding processes in the geosciences, thanks to the establishment of cost-efficient analytical instruments capable of measuring properties over large numbers of particles, grains or crystals in a sample. Functional data analysis allows the direct application of multivariate methods, such as principal component analysis, to such distributions. However, these are often observed in the form of samples, and thus incur a sampling error. This additional sampling error changes the properties of the multivariate variance and thus the number of relevant principal components and their direction. The result of the principal component analysis becomes an artifact of the sampling error and can negatively affect the subsequent data analysis. This work presents a way of estimating this sampling error and how to confront it in the context of principal component analysis, where the principal components are obtained as a linear combination of elements of a newly constructed orthogonal spline basis. The effect of the sampling error and the effectiveness of the correction are demonstrated with a series of simulations. It is shown how the interpretability and reproducibility of the principal components improve and become independent of the selection of the basis. The proposed method is then applied to grain size distributions in a geometallurgical dataset from Thaba mine in the Bushveld complex.
Title in English
Principal Component Analysis for Distributions Observed by Samples in Bayes Spaces
Result description in English
Distributional data have recently become increasingly important for understanding processes in the geosciences, thanks to the establishment of cost-efficient analytical instruments capable of measuring properties over large numbers of particles, grains or crystals in a sample. Functional data analysis allows the direct application of multivariate methods, such as principal component analysis, to such distributions. However, these are often observed in the form of samples, and thus incur a sampling error. This additional sampling error changes the properties of the multivariate variance and thus the number of relevant principal components and their direction. The result of the principal component analysis becomes an artifact of the sampling error and can negatively affect the subsequent data analysis. This work presents a way of estimating this sampling error and how to confront it in the context of principal component analysis, where the principal components are obtained as a linear combination of elements of a newly constructed orthogonal spline basis. The effect of the sampling error and the effectiveness of the correction are demonstrated with a series of simulations. It is shown how the interpretability and reproducibility of the principal components improve and become independent of the selection of the basis. The proposed method is then applied to grain size distributions in a geometallurgical dataset from Thaba mine in the Bushveld complex.
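The description above says that sampling error inflates the multivariate variance and changes the apparent number of relevant principal components. The following is a minimal simulation sketch of that effect only; it is not the authors' method and does not use their orthogonal spline basis or their correction. It assumes hypothetical densities proportional to exp(a*x) on [0, 1], a plain histogram discretization, a pseudo-count of 0.5, and a centred log-ratio (clr) transform, all chosen here purely for illustration. The function names are likewise illustrative.

```python
import numpy as np

def sample_density(a, n, rng):
    """Draw n points from the density f(x) ∝ exp(a*x) on [0, 1] (inverse-CDF sampling)."""
    u = rng.uniform(size=n)
    return np.log1p(u * np.expm1(a)) / a

def clr_histogram(x, n_bins, pseudo_count=0.5):
    """Histogram the sample on [0, 1] and return its clr coefficients.

    The pseudo-count (an assumption of this sketch) keeps the logarithm finite
    for empty bins. Normalizing the counts is unnecessary because subtracting
    the mean log removes any constant factor.
    """
    counts, _ = np.histogram(x, bins=n_bins, range=(0.0, 1.0))
    log_p = np.log(counts + pseudo_count)
    return log_p - log_p.mean()

def pca_variances(clr_rows):
    """Return total variance and the share explained by each principal component."""
    centered = clr_rows - clr_rows.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values**2 / (clr_rows.shape[0] - 1)
    return variances.sum(), variances / variances.sum()

def run(n_per_sample, n_obs=200, n_bins=10, seed=0):
    """Simulate n_obs distributions, each observed through n_per_sample draws."""
    rng = np.random.default_rng(seed)
    # One latent parameter per distribution, so the true clr data have rank one.
    slopes = rng.uniform(-1.0, 1.0, size=n_obs)
    rows = np.array([
        clr_histogram(sample_density(a, n_per_sample, rng), n_bins)
        for a in slopes
    ])
    return pca_variances(rows)

total_small, share_small = run(n_per_sample=100)
total_large, share_large = run(n_per_sample=100_000)

print("total variance, 100 draws per distribution:    ", total_small)
print("total variance, 100000 draws per distribution: ", total_large)
print("share of first PC, 100 draws:    ", share_small[0])
print("share of first PC, 100000 draws: ", share_large[0])
```

In this toy setting the underlying distributions vary along a single direction, so with large samples the first component carries almost all of the variance. With small samples the total variance grows and spreads over further components that reflect sampling noise rather than structure in the data.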
Classification
Type
J<sub>imp</sub> - Article in a journal indexed in the Web of Science database
CEP field
—
OECD FORD field
10102 - Applied mathematics
Result linkages
Project
<a href="/cs/project/GF22-15684L" target="_blank" >GF22-15684L: Generalized relative data and robustness in Bayes spaces</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Journal name
Mathematical Geosciences
ISSN
1874-8961
e-ISSN
1874-8953
Journal volume
56
Issue number within the volume
8
Publisher country
DE - Federal Republic of Germany
Number of pages
29
Pages from-to
1641-1669
UT WoS article code
001216033100001
Result EID in the Scopus database
2-s2.0-85192019581