A new method for correlation analysis of compositional (environmental) data – a worked example
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F17%3A73582563" target="_blank" >RIV/61989592:15310/17:73582563 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0048969717314675" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0048969717314675</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.scitotenv.2017.06.063" target="_blank" >10.1016/j.scitotenv.2017.06.063</a>
Alternative languages
Result language
English
Title in original language
A new method for correlation analysis of compositional (environmental) data – a worked example
Result description in original language
Most data in environmental sciences and geochemistry are compositional. Already the unit used to report the data (e.g., μg/l, mg/kg, wt%) implies that the analytical results for each element are not free to vary independently of the other measured variables. This is often neglected in statistical analysis, where a simple log-transformation of the single variables is insufficient to put the data into an acceptable geometry. This is also important for bivariate data analysis and for correlation analysis, for which the data need to be appropriately log-ratio transformed. A new approach based on the isometric log-ratio (ilr) transformation, leading to so-called symmetric coordinates, is presented here. Summarizing the correlations in a heat-map gives a powerful tool for bivariate data analysis. Here an application of the new method using a data set from a regional geochemical mapping project based on soil O and C horizon samples is demonstrated. Differences to ‘classical’ correlation analysis based on log-transformed data are highlighted. The fact that some expected strong positive correlations appear and remain unchanged even following a log-ratio transformation has probably led to the misconception that the special nature of compositional data can be ignored when working with trace elements. The example dataset is employed to demonstrate that using ‘classical’ correlation analysis and plotting XY diagrams, scatterplots, based on the original or simply log-transformed data can easily lead to severe misinterpretations of the relationships between elements.
Title in English
A new method for correlation analysis of compositional (environmental) data – a worked example
Result description in English
Most data in environmental sciences and geochemistry are compositional. Already the unit used to report the data (e.g., μg/l, mg/kg, wt%) implies that the analytical results for each element are not free to vary independently of the other measured variables. This is often neglected in statistical analysis, where a simple log-transformation of the single variables is insufficient to put the data into an acceptable geometry. This is also important for bivariate data analysis and for correlation analysis, for which the data need to be appropriately log-ratio transformed. A new approach based on the isometric log-ratio (ilr) transformation, leading to so-called symmetric coordinates, is presented here. Summarizing the correlations in a heat-map gives a powerful tool for bivariate data analysis. Here an application of the new method using a data set from a regional geochemical mapping project based on soil O and C horizon samples is demonstrated. Differences to ‘classical’ correlation analysis based on log-transformed data are highlighted. The fact that some expected strong positive correlations appear and remain unchanged even following a log-ratio transformation has probably led to the misconception that the special nature of compositional data can be ignored when working with trace elements. The example dataset is employed to demonstrate that using ‘classical’ correlation analysis and plotting XY diagrams, scatterplots, based on the original or simply log-transformed data can easily lead to severe misinterpretations of the relationships between elements.
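The point made in the description, that a plain log-transformation does not remove the effect of reporting data as parts of a whole, can be illustrated with a small sketch. This is not the symmetric-coordinates method of the article; it swaps in the simpler centred log-ratio (clr) transformation purely to show scale invariance. The synthetic data, the function names `closure` and `clr`, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def closure(x, total=1.0):
    """Rescale each row so that its parts sum to `total` (e.g. 1 or 100 wt%)."""
    x = np.asarray(x, dtype=float)
    return total * x / x.sum(axis=1, keepdims=True)


def clr(x):
    """Centred log-ratio: log of each part minus the row mean of the logs."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=1, keepdims=True)


# Hypothetical data: 500 samples, 4 independent positive "element" amounts.
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=(500, 4))

# Reporting the same samples as proportions imposes the constant-sum constraint.
closed = closure(raw)

# 'Classical' correlations of log-transformed values change under closure ...
corr_log_raw = np.corrcoef(np.log(raw), rowvar=False)
corr_log_closed = np.corrcoef(np.log(closed), rowvar=False)

# ... whereas log-ratio coordinates do not depend on the reporting scale.
corr_clr_raw = np.corrcoef(clr(raw), rowvar=False)
corr_clr_closed = np.corrcoef(clr(closed), rowvar=False)

print("log, raw vs closed identical:", np.allclose(corr_log_raw, corr_log_closed))
print("clr, raw vs closed identical:", np.allclose(corr_clr_raw, corr_clr_closed))
```

Correlations between clr variables still have to be interpreted with care, because each clr variable involves all parts of the composition; the article addresses bivariate correlation with ilr-based symmetric coordinates instead.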
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10103 - Statistics and probability
Result continuities
Project
—
Continuities
N - Research activity supported from non-public sources
Others
Year of publication
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Science of the Total Environment
ISSN
0048-9697
e-ISSN
—
Volume of the periodical
607-608
Issue of the periodical within the volume
DEC
Country of the periodical's publisher
NL - Netherlands
Number of pages of the result
7
Pages from-to
965-971
UT WoS code of the article
000408755300096
EID of the result in the Scopus database
—