Generalised linear model-based algorithm for detection of outliers in environmental data and comparison with semi-parametric outlier detection methods
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F19%3A43915809" target="_blank" >RIV/62156489:43110/19:43915809 - isvavai.cz</a>
Alternative codes found
RIV/60162694:G42__/19:00536896
Result on the web
<a href="https://doi.org/10.1016/j.apr.2019.01.010" target="_blank" >https://doi.org/10.1016/j.apr.2019.01.010</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.apr.2019.01.010" target="_blank" >10.1016/j.apr.2019.01.010</a>
Alternative languages
Result language
English
Original language name
Generalised linear model-based algorithm for detection of outliers in environmental data and comparison with semi-parametric outlier detection methods
Original language description
Outliers are often present in large datasets of air pollutant concentrations. Existing methods for detecting outliers in environmental data fall into three groups, depending on the character of the data: methods for time series, methods for time series measured simultaneously with accompanying variables, and methods for spatial data. Many methods proposed for the automatic detection of outliers in time series are limited by the assumption that the distribution of the analysed variable is known. Because environmental variables are often influenced by accompanying factors, their distribution is difficult to estimate. Using the known information about accompanying variables, together with methods designed for time series measured simultaneously with such variables, can significantly improve outlier detection. This paper presents a method for the automatic detection of outliers in PM10 aerosols measured simultaneously with accompanying variables. The method is based on a generalised linear model and subsequent analysis of the residuals, and it exploits the additional information provided by the available accompanying variables. The results of the proposed procedure are compared with those obtained using two distribution-free outlier detection methods for time series previously proposed by the authors. A simulation-based comparison of all three procedures showed that the procedure presented in this paper effectively detects outliers that deviate by at least 5 standard deviations from the mean of the neighbouring observations, and that it outperforms both distribution-free outlier detection methods for time series.
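The general idea in the description (fit a generalised linear model on an accompanying variable, then analyse the residuals) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the Gamma family with a log link, the single covariate `x`, the synthetic data, the robust MAD-based scaling of Pearson residuals, and the cut-off `threshold=5.0`. The function names are hypothetical.

```python
import math
from statistics import median


def ols(x, z):
    """Ordinary least squares for z = b0 + b1 * x (single covariate)."""
    n = len(x)
    sx, sz = sum(x), sum(z)
    sxx = sum(xi * xi for xi in x)
    sxz = sum(xi * zi for xi, zi in zip(x, z))
    b1 = (n * sxz - sx * sz) / (n * sxx - sx * sx)
    b0 = (sz - b1 * sx) / n
    return b0, b1


def fit_gamma_glm_log_link(x, y, iterations=25):
    """Fit log(mu) = b0 + b1 * x by iteratively reweighted least squares.

    For a Gamma family with a log link the IRLS weights are all equal,
    so each step is an unweighted regression of the working response on x.
    """
    eta = [math.log(v) for v in y]  # start from the observed values (y > 0)
    b0 = b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(e) for e in eta]
        # Working response: eta + (y - mu) * d(eta)/d(mu), with d(eta)/d(mu) = 1/mu
        z = [e + (v - m) / m for e, v, m in zip(eta, y, mu)]
        b0, b1 = ols(x, z)
        eta = [b0 + b1 * xi for xi in x]
    return b0, b1


def flag_outliers(x, y, threshold=5.0):
    """Return indices whose Pearson residual is far from the bulk.

    The MAD-based scale and the threshold are illustrative assumptions.
    """
    b0, b1 = fit_gamma_glm_log_link(x, y)
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    resid = [(v - m) / m for v, m in zip(y, mu)]  # Gamma Pearson residuals
    centre = median(resid)
    scale = 1.4826 * median([abs(r - centre) for r in resid])
    return [i for i, r in enumerate(resid) if abs(r - centre) > threshold * scale]


# Synthetic series: a concentration that decays with a covariate,
# small deterministic fluctuations, and one injected spike at index 20.
x = [i / 10 for i in range(40)]
y = [math.exp(3.0 - 0.5 * xi) * (1 + 0.05 * math.sin(1.7 * i))
     for i, xi in enumerate(x)]
y[20] *= 3.0

flagged = flag_outliers(x, y)
print(flagged)
```

Scaling the residuals by the median absolute deviation rather than the sample standard deviation keeps the injected spike from inflating its own detection threshold; the published procedure may treat the residuals differently.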
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10103 - Statistics and probability
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2019
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Atmospheric Pollution Research
ISSN
1309-1042
e-ISSN
—
Volume of the periodical
10
Issue of the periodical within the volume
4
Country of publishing house
TR - TURKEY
Number of pages
9
Pages from-to
1015-1023
UT code for WoS article
000472996900002
EID of the result in the Scopus database
2-s2.0-85067862378