Development and comparison of circulation type classifications using the COST 733 dataset and software
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68378289%3A_____%2F16%3A00432101" target="_blank" >RIV/68378289:_____/16:00432101 - isvavai.cz</a>
Alternative codes found
RIV/00216208:11310/16:10328679
Result on the web
<a href="http://dx.doi.org/10.1002/joc.3920" target="_blank" >http://dx.doi.org/10.1002/joc.3920</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1002/joc.3920" target="_blank" >10.1002/joc.3920</a>
Alternative languages
Result language
English
Title in the original language
Development and comparison of circulation type classifications using the COST 733 dataset and software
Result description in the original language
In order to examine the correspondence between different methods for circulation type classification, a dataset of classification catalogs for 12 different European regions has been created using a specially developed software package. Twenty-seven basic automatic classification methods have been applied in several variants to different input datasets describing atmospheric circulation. Together with six manual classifications, a total of 33 methods are available for inter-comparison. Pattern correlation, frequency time-series correlation and the adjusted Rand index have been used for comparison. Highly significant correspondence has been detected only for two clustering techniques, while the remaining classification methods show surprisingly low similarity. A Monte-Carlo test with 1000 classifications of randomly defined types even shows that most of the methods are no more similar to one another than arbitrarily chosen types. The predominant dissimilarity between the methods is interpreted as a result of a lack of inherent structure in the input data. Only simulated annealing clustering and self-organizing maps produce nearly identical results, because they can optimally fit the partitioning to the outer shape of the data cloud in the phase space. Methods based on pre-defined types also yield very different results, because small changes in the definition of thresholds may lead to large differences in the partitioning. It is concluded that, because of the missing inner structure of the data, there is no clear statistical reason to prefer any of the examined methods. For practice in synoptic climatology this means that finding a suitable classification for a certain purpose may require a broad comparison of methods. The software package cost733class for development, comparison and evaluation of classifications, which was developed and used in this study, is available to facilitate this task.
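The comparison described above can be illustrated with a minimal Python sketch. It is not the cost733class implementation: the function names, the toy catalogs, and the use of uniformly random type labels as the Monte-Carlo baseline are all assumptions made for illustration (the study's "randomly defined types" may be constructed differently). It computes the adjusted Rand index between two classification catalogs and scores a reference catalog against randomly generated ones.

```python
import random
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two catalogs (one type label per day)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("catalogs must cover the same days")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two days")
    # Count pairs of days that share a type in both, in A only, and in B only.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)  # agreement expected by chance
    max_index = 0.5 * (pairs_a + pairs_b)
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (pairs_joint - expected) / (max_index - expected)


def random_baseline(reference, n_types, n_runs=1000, seed=0):
    """Hypothetical helper: score the reference against random catalogs.

    Assumption: each day receives a uniformly random type label.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        random_catalog = [rng.randrange(n_types) for _ in reference]
        scores.append(adjusted_rand_index(reference, random_catalog))
    return scores


if __name__ == "__main__":
    rng = random.Random(42)
    catalog_a = [rng.randrange(9) for _ in range(365)]  # toy catalog, 9 types
    catalog_b = [(t + 1) % 9 for t in catalog_a]  # same partition, relabeled
    baseline = random_baseline(catalog_a, n_types=9, n_runs=200)
    print("ARI(a, b):", adjusted_rand_index(catalog_a, catalog_b))
    print("largest ARI against random catalogs:", max(baseline))
```

Because the index depends only on which days are grouped together, relabeling the types leaves it unchanged, which is why it suits comparisons between methods that number their types differently. A method pair whose index falls inside the range of the random baseline would be judged no more similar than arbitrarily chosen types.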
Classification
Type
J<sub>x</sub> - Unclassified - Article in a scholarly periodical (Jimp, Jsc and Jost)
CEP field
DG - Atmospheric sciences, meteorology
OECD FORD field
—
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2016
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
International Journal of Climatology
ISSN
0899-8418
e-ISSN
—
Periodical volume
36
Issue of the periodical within the volume
7
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages of the result
19
Pages from-to
2673-2691
UT WoS code of the article
000377276300002
EID of the result in the Scopus database
2-s2.0-84895141784