Exploratory tools for outlier detection in compositional data with structural zeros
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F17%3A73582557" target="_blank" >RIV/61989592:15310/17:73582557 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1080/02664763.2016.1182135" target="_blank" >http://dx.doi.org/10.1080/02664763.2016.1182135</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1080/02664763.2016.1182135" target="_blank" >10.1080/02664763.2016.1182135</a>
Alternative languages
Result language
English
Original language name
Exploratory tools for outlier detection in compositional data with structural zeros
Original language description
The analysis of compositional data using the log-ratio approach is based on ratios between the compositional parts. Zeros in the parts thus cause serious difficulties for the analysis. This is a particular problem in the case of structural zeros, which cannot simply be replaced by a non-zero value, as is done, e.g., for values below the detection limit or for missing values. Instead, structural zeros need to be incorporated into further statistical processing. The focus is on exploratory tools for identifying outliers in compositional data sets with structural zeros. For this purpose, Mahalanobis distances are estimated, either computed directly for subcompositions determined by their zero patterns, or obtained by using imputation to improve the efficiency of the estimates and then proceeding to the subcompositional and subgroup level. For this approach, new theory is formulated that makes it possible to estimate covariances for imputed compositional data and to apply the estimates to subgroups using parts of this covariance matrix. Moreover, the zero pattern structure is analyzed using principal component analysis for binary data to achieve a comprehensive view of the overall multivariate data structure. The proposed tools are applied to larger compositional data sets from official statistics, where the need for an appropriate treatment of zeros is obvious.
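As a rough illustration of the first variant mentioned in the description (Mahalanobis distances computed directly for subcompositions determined by their zero patterns), a minimal Python sketch might look as follows. It is not the authors' implementation. The function names (`ilr_basis`, `ilr`, `mahalanobis_by_zero_pattern`), the choice of isometric log-ratio coordinates, and the use of the classical sample mean and covariance are all assumptions made for this example; an outlier analysis would typically substitute robust estimators, and the imputation-based variant is not covered here.

```python
import numpy as np

def ilr_basis(d):
    """Orthonormal Helmert-type basis of the clr hyperplane, shape (d-1, d)."""
    v = np.zeros((d - 1, d))
    for i in range(1, d):
        v[i - 1, :i] = 1.0 / i
        v[i - 1, i] = -1.0
        v[i - 1] *= np.sqrt(i / (i + 1.0))
    return v

def ilr(x):
    """Isometric log-ratio coordinates of the strictly positive rows of x."""
    logx = np.log(x)
    clr = logx - logx.mean(axis=1, keepdims=True)
    return clr @ ilr_basis(x.shape[1]).T

def mahalanobis_by_zero_pattern(x):
    """Mahalanobis distance of each row within its own zero-pattern group.

    Assumption for this sketch: classical mean and sample covariance.
    Rows in groups that are too small are left as NaN.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    dist = np.full(n, np.nan)
    patterns = (x == 0)
    for pat in np.unique(patterns, axis=0):
        rows = np.where((patterns == pat).all(axis=1))[0]
        parts = np.where(~pat)[0]          # non-zero parts of this pattern
        p = len(parts) - 1                 # dimension of the ilr coordinates
        if p < 1 or len(rows) <= p + 1:
            continue                       # too few parts or observations
        z = ilr(x[np.ix_(rows, parts)])
        diff = z - z.mean(axis=0)
        cov = np.atleast_2d(np.cov(z, rowvar=False))
        sol = np.linalg.solve(cov, diff.T).T
        dist[rows] = np.sqrt(np.einsum("ij,ij->i", diff, sol))
    return dist
```

Because the Mahalanobis distance is invariant under the choice of orthonormal log-ratio basis, the particular basis used in `ilr_basis` does not affect the resulting distances; large values flag observations that are atypical within their own zero-pattern group.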
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10103 - Statistics and probability
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2017
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Journal of Applied Statistics
ISSN
0266-4763
e-ISSN
—
Volume of the periodical
44
Issue of the periodical within the volume
4
Country of publishing house
GB - UNITED KINGDOM
Number of pages
19
Pages from-to
734-752
UT code for WoS article
000396038500011
EID of the result in the Scopus database
—