Divergence decision tree classification with Kolmogorov kernel smoothing in high energy physics
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21340%2F21%3A00353093" target="_blank" >RIV/68407700:21340/21:00353093 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1088/1742-6596/1730/1/012060" target="_blank" >https://doi.org/10.1088/1742-6596/1730/1/012060</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1088/1742-6596/1730/1/012060" target="_blank" >10.1088/1742-6596/1730/1/012060</a>
Alternative languages
Result language
English
Title in original language
Divergence decision tree classification with Kolmogorov kernel smoothing in high energy physics
Result description in original language
Binary classification of a given dataset is the task of assigning one of two possible classes to each observation. This can be achieved by many machine learning techniques, e.g. logistic regression, decision trees, or neural networks. The supervised divergence decision tree (SDDT) is our own binary classification algorithm based on the Rényi divergence, which incorporates multi-dimensional kernel density estimates (KDEs) as the main part of the splitting process in its tree nodes. However, the KDE needs efficient smoothing in order to obtain satisfactory classification results. In this paper, the D-discrepancy method for selecting the bandwidth is applied. It is based on an evaluation of divergences, or distances, between two estimated distributions. The Kolmogorov metric distance on probability space is used, and the performance of this novel technique is compared to standard smoothing techniques. The final goal is to perform a binary classification and achieve the best possible results with respect to the AUC value (area under the ROC curve) on a given high energy physics (HEP) dataset, specifically d+Au heavy-ion decay data. The HEP dataset is described and the main structure of the SDDT used is outlined. Final classification results are presented for KDE under the Kolmogorov D-method of smoothing in the SDDT algorithm.
Title in English
Divergence decision tree classification with Kolmogorov kernel smoothing in high energy physics
Result description in English
Binary classification of a given dataset is the task of assigning one of two possible classes to each observation. This can be achieved by many machine learning techniques, e.g. logistic regression, decision trees, or neural networks. The supervised divergence decision tree (SDDT) is our own binary classification algorithm based on the Rényi divergence, which incorporates multi-dimensional kernel density estimates (KDEs) as the main part of the splitting process in its tree nodes. However, the KDE needs efficient smoothing in order to obtain satisfactory classification results. In this paper, the D-discrepancy method for selecting the bandwidth is applied. It is based on an evaluation of divergences, or distances, between two estimated distributions. The Kolmogorov metric distance on probability space is used, and the performance of this novel technique is compared to standard smoothing techniques. The final goal is to perform a binary classification and achieve the best possible results with respect to the AUC value (area under the ROC curve) on a given high energy physics (HEP) dataset, specifically d+Au heavy-ion decay data. The HEP dataset is described and the main structure of the SDDT used is outlined. Final classification results are presented for KDE under the Kolmogorov D-method of smoothing in the SDDT algorithm.
Classification
Type
J<sub>ost</sub> - Other articles in peer-reviewed periodicals
CEP field
—
OECD FORD field
10103 - Statistics and probability
Result linkages
Project
The result was created during the realization of more than one project. More information is available in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities
Other
Year of implementation
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Journal of Physics Conference Series
ISSN
1742-6588
e-ISSN
—
Volume of the periodical
1730
Issue of the periodical within the volume
1
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages of the result
6
Pages from-to
—
UT WoS code of the article
—
EID of the result in the Scopus database
2-s2.0-85101557186