Weighting the domain of probability densities in functional data analysis
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F20%3A73604821" target="_blank" >RIV/61989592:15310/20:73604821 - isvavai.cz</a>
Result on the web
<a href="https://obd.upol.cz/id_publ/333184707" target="_blank" >https://obd.upol.cz/id_publ/333184707</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1002/sta4.283" target="_blank" >10.1002/sta4.283</a>
Alternative languages
Result language
English
Title in original language
Weighting the domain of probability densities in functional data analysis
Result description in original language
In functional data analysis, some regions of the domain of the functions can be of more interest than others owing to the quality of measurement, the relative scale of the domain, or simply some external reason (e.g., the interest of stakeholders). Weighting the domain is of particular interest for probability density functions (PDFs), as derived from distributional data, which often aggregate measurements of different quality or are affected by scale effects. A weighting scheme can be embedded into the underlying sample space of PDFs when they are treated as continuous compositions using the theory of Bayes spaces. The origin of a Bayes space is determined by a given reference measure, and this can be easily changed through the well-known chain rule. This work provides a formal framework for defining weights through a reference measure, and it is used to develop a weighting scheme on the bounded domain of distributional data. The impact on statistical analysis is illustrated through an application to functional principal component analysis of income distribution data. Moreover, a novel centred log-ratio transformation is proposed to map a weighted Bayes space into an unweighted L2 space, which enables the use of most tools developed in functional data analysis (e.g., clustering and regression analysis) while accounting for the weighting scheme. The potential of the proposal is shown on a real case study using Italian income data.
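The description mentions two computational steps: changing the reference measure through the chain rule, and a centred log-ratio (clr) transformation that lands in an unweighted L2 space. The record does not give the formulas, so the following is only a minimal numerical sketch under stated assumptions: the reference measure P is assumed to have a positive density p with respect to the Lebesgue measure on a bounded interval, the weighted clr is assumed to be the log-density with respect to P centred by its P-weighted mean, and the map to the unweighted space is assumed to be multiplication by sqrt(p). The function names, the example densities, and the grid are all hypothetical.

```python
import numpy as np


def trapz(y, x):
    """Trapezoidal rule, written out locally to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)


def weighted_clr(f, p, x):
    """Assumed weighted clr of a density f given on grid x.

    f is a density w.r.t. the Lebesgue measure; p is the (assumed) density
    of the reference measure P. By the chain rule, the density of f w.r.t.
    P is f / p.
    """
    log_ratio = np.log(f / p)                 # log-density w.r.t. P
    total = trapz(p, x)                       # P(domain)
    mean = trapz(log_ratio * p, x) / total    # P-weighted mean of the log-density
    return log_ratio - mean


def to_unweighted_l2(clr_w, p):
    """Assumed map from the weighted L2(P) space to the unweighted L2 space."""
    return np.sqrt(p) * clr_w


# Hypothetical example on the bounded domain [0, 1].
x = np.linspace(0.0, 1.0, 2001)
f = (1.0 + x) / 1.5      # a positive density w.r.t. Lebesgue (integrates to 1)
p = 0.5 + x              # hypothetical weights: more weight near the upper end

clr_w = weighted_clr(f, p, x)
clr_u = to_unweighted_l2(clr_w, p)

# The weighted clr integrates to zero against P, and the unweighted L2 norm
# of clr_u equals the P-weighted L2 norm of clr_w.
zero_integral = trapz(clr_w * p, x)
norm_weighted = trapz(clr_w**2 * p, x)
norm_unweighted = trapz(clr_u**2, x)
```

Under these assumptions, standard unweighted tools (for instance, principal component analysis on the discretised `clr_u` curves) would account for the weights implicitly, which appears to be the motivation described in the abstract.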
Classification
Type
J<sub>SC</sub> - Article in a periodical indexed in the SCOPUS database
CEP field
—
OECD FORD field
10103 - Statistics and probability
Result linkages
Project
<a href="/cs/project/GA19-01768S" target="_blank" >GA19-01768S: Separation of geochemical signals in sediments: application of advanced statistical methods to large geochemical data sets</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2020
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Stat
ISSN
2049-1573
e-ISSN
—
Periodical volume
9
Issue number within the volume
1
Country of the periodical's publisher
US - United States of America
Number of pages of the result
13
Pages from-to
"e283-1"-"e283-13"
UT WoS code of the article
000614806100027
EID of the result in the Scopus database
2-s2.0-85094180322