Semiparametric outlier detection in nonstationary times series: Case study for atmospheric pollution in Brno, Czech Republic
Result identifiers
Result code in IS VaVaI
RIV/62156489:43110/18:43912662 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F18%3A43912662)
Alternative codes found
RIV/60162694:G42__/18:00534192 RIV/00216305:26110/18:PU123965
Result on the web
https://doi.org/10.1016/j.apr.2017.06.005
DOI - Digital Object Identifier
10.1016/j.apr.2017.06.005 (http://dx.doi.org/10.1016/j.apr.2017.06.005)
Alternative languages
Result language
English
Title in original language
Semiparametric outlier detection in nonstationary times series: Case study for atmospheric pollution in Brno, Czech Republic
Result description in original language
Large environmental datasets usually include outliers, which can significantly affect further analysis and modelling. Various outlier detection methods exist, but they depend on the distribution of the analysed variable, and quite often the distribution of environmental variables cannot be estimated. This paper presents an approach for identifying outliers in environmental time series that does not impose restrictions on the distribution of the observed variables. The suggested algorithm combines kernel smoothing and extreme value estimation techniques for stochastic processes while allowing for a nonstationary expected value of the process. Nonstationarity in variance is handled by a change point analysis that precedes the proposed algorithm. Possible outliers are identified as observations with rare occurrence and, in line with extreme value methodology, confidence limits for high values of the observed variables are constructed. The proposed methodology can be especially convenient where the data have to be validated manually, since it significantly reduces the number of implausible observations. As a case study, the technique is applied to outlier detection in a time series of hourly PM10 concentrations in Brno, Czech Republic. The methodology is derived from solid theoretical results and seems to perform well for the PM10 series. Its flexibility, however, makes it applicable beyond series of atmospheric pollutants. On the other hand, the choice of return level turns out to be crucial for the sensitivity to outliers. This choice should be left to practitioners, with respect to the specific application conditions.
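The combination of kernel smoothing and extreme value estimation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian Nadaraya-Watson smoother, the method-of-moments fit of a generalized Pareto distribution to threshold exceedances, and all names and parameter values (`kernel_smooth`, `gpd_return_level`, `flag_outliers`, the bandwidth, the 95% threshold, the return period) are assumptions made for illustration. The sketch also assumes constant variance, which the paper secures by a preceding change point analysis, and it uses synthetic data rather than the Brno PM10 series.

```python
import math
import random
import statistics


def kernel_smooth(y, bandwidth):
    """Nadaraya-Watson estimate of the mean on an equally spaced time grid,
    using a Gaussian kernel truncated at 4 bandwidths."""
    n = len(y)
    reach = int(math.ceil(4 * bandwidth))
    weights = [math.exp(-0.5 * (k / bandwidth) ** 2) for k in range(-reach, reach + 1)]
    smooth = []
    for i in range(n):
        num = den = 0.0
        for k in range(-reach, reach + 1):
            j = i + k
            if 0 <= j < n:
                w = weights[k + reach]
                num += w * y[j]
                den += w
        smooth.append(num / den)
    return smooth


def gpd_return_level(residuals, threshold_quantile, return_period):
    """Level exceeded on average once per `return_period` observations,
    from a generalized Pareto fit (method of moments) to threshold excesses."""
    ordered = sorted(residuals)
    u = ordered[int(threshold_quantile * (len(ordered) - 1))]
    excess = [r - u for r in residuals if r > u]
    rate = len(excess) / len(residuals)  # empirical P(residual > u)
    mean = statistics.fmean(excess)
    var = statistics.variance(excess)
    xi = 0.5 * (1.0 - mean * mean / var)            # shape estimate
    sigma = 0.5 * mean * (mean * mean / var + 1.0)  # scale estimate
    m = return_period * rate
    if abs(xi) < 1e-8:
        return u + sigma * math.log(m)
    return u + sigma / xi * (m ** xi - 1.0)


def flag_outliers(y, bandwidth=12.0, threshold_quantile=0.95, return_period=2000):
    """Indices whose residual from the smoothed mean exceeds the return level."""
    trend = kernel_smooth(y, bandwidth)
    residuals = [a - b for a, b in zip(y, trend)]
    level = gpd_return_level(residuals, threshold_quantile, return_period)
    return [i for i, r in enumerate(residuals) if r > level]


# Synthetic hourly-like series: slowly varying mean plus noise, two injected spikes.
random.seed(0)
series = [40 + 15 * math.sin(2 * math.pi * t / 400) + random.gauss(0, 5)
          for t in range(2000)]
for idx in (500, 1500):
    series[idx] += 80
flagged = flag_outliers(series)
print(flagged)
```

Raising `return_period` makes the limit less sensitive and lowering it flags more observations, which mirrors the abstract's remark that the choice of return level is crucial and best left to the practitioner.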
Classification
Type
Jimp - Article in a periodical in the Web of Science database
CEP field
—
OECD FORD field
10103 - Statistics and probability
Result linkages
Project
LO1408: AdMaS UP - Pokročilé stavební materiály, konstrukce a technologie (Advanced Building Materials, Structures and Technologies)
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Atmospheric Pollution Research
ISSN
1309-1042
e-ISSN
—
Periodical volume
9
Periodical issue within the volume
1
Country of the periodical's publisher
TR - Republic of Turkey
Number of pages of the result
10
Pages from-to
27-36
UT WoS code of the article
000429175800003
EID of the result in the Scopus database
2-s2.0-85020848643