A new method for correlation analysis of compositional (environmental) data – a worked example
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F17%3A73582563" target="_blank" >RIV/61989592:15310/17:73582563 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0048969717314675" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0048969717314675</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.scitotenv.2017.06.063" target="_blank" >10.1016/j.scitotenv.2017.06.063</a>
Alternative languages
Result language
English
Original language name
A new method for correlation analysis of compositional (environmental) data – a worked example
Original language description
Most data in environmental sciences and geochemistry are compositional. Already the unit used to report the data (e.g., μg/l, mg/kg, wt%) implies that the analytical results for each element are not free to vary independently of the other measured variables. This is often neglected in statistical analysis, where a simple log-transformation of the single variables is insufficient to put the data into an acceptable geometry. This is also important for bivariate data analysis and for correlation analysis, for which the data need to be appropriately log-ratio transformed. A new approach based on the isometric log-ratio (ilr) transformation, leading to so-called symmetric coordinates, is presented here. Summarizing the correlations in a heat-map gives a powerful tool for bivariate data analysis. Here an application of the new method using a data set from a regional geochemical mapping project based on soil O and C horizon samples is demonstrated. Differences to ‘classical’ correlation analysis based on log-transformed data are highlighted. The fact that some expected strong positive correlations appear and remain unchanged even following a log-ratio transformation has probably led to the misconception that the special nature of compositional data can be ignored when working with trace elements. The example dataset is employed to demonstrate that using ‘classical’ correlation analysis and plotting XY diagrams, scatterplots, based on the original or simply log-transformed data can easily lead to severe misinterpretations of the relationships between elements.
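As a rough illustration of the log-ratio idea mentioned in the description, the Python sketch below computes ordinary pivot (ilr) coordinates for a single composition. It is not the symmetric-coordinate construction presented in the article, which this record does not spell out; the function names `clr` and `pivot_ilr` and the sample values are hypothetical and chosen only for the example.

```python
import math


def clr(x):
    """Centred log-ratio: log of each part minus the mean of all logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]


def pivot_ilr(x):
    """Pivot (ilr) coordinates of a composition with D strictly positive parts.

    Coordinate i contrasts part i with the geometric mean of the parts
    that follow it, scaled so that the D-1 coordinates are orthonormal.
    """
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        r = len(rest)  # number of parts remaining after part i
        coords.append(math.sqrt(r / (r + 1)) * (logs[i] - sum(rest) / r))
    return coords


# Hypothetical four-part sample; the values are made up for illustration.
sample = [12.0, 3.5, 0.8, 40.0]
z = pivot_ilr(sample)

# Changing the reporting unit multiplies every part by the same constant
# and leaves the log-ratio coordinates unchanged.
z_rescaled = pivot_ilr([v * 1000.0 for v in sample])
print(z)
print(z_rescaled)
```

The coordinates depend only on ratios between parts, so the reporting unit drops out, and their Euclidean norm equals that of the clr vector, which is the "isometric" property. Correlations would then be computed on such coordinates across many samples rather than on the individually log-transformed concentrations.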
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10103 - Statistics and probability
Result continuities
Project
—
Continuities
N - Research activity supported from non-public sources
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Science of the Total Environment
ISSN
0048-9697
e-ISSN
—
Volume of the periodical
607-608
Issue of the periodical within the volume
DEC
Country of publishing house
NL - THE KINGDOM OF THE NETHERLANDS
Number of pages
7
Pages from-to
965-971
UT code for WoS article
000408755300096
EID of the result in the Scopus database
—