Decision-Forest Voting Scheme for Classification of Rare Classes in Network Intrusion Detection
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F18%3A00328014" target="_blank" >RIV/68407700:21230/18:00328014 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/abstract/document/8616560/keywords#keywords" target="_blank" >https://ieeexplore.ieee.org/abstract/document/8616560/keywords#keywords</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/SMC.2018.00563" target="_blank" >10.1109/SMC.2018.00563</a>
Alternative languages
Result language
English
Title in original language
Decision-Forest Voting Scheme for Classification of Rare Classes in Network Intrusion Detection
Result description in original language
In this paper, Bayesian-based aggregation of decision trees in an ensemble (decision forest) is investigated. The focus is on multi-class classification with the number of samples significantly skewed toward one of the classes. The algorithm leverages out-of-bag datasets to estimate the prediction errors of individual trees, which are then used in accordance with the Bayes rule to refine the decision of the ensemble. The algorithm takes the prevalence of individual classes into account and does not require setting any additional parameters related to class weights or decision-score thresholds. Evaluation is based on publicly available datasets as well as on a proprietary dataset comprising network traffic telemetry from hundreds of enterprise networks with over a million users overall. The aim is to increase the detection capabilities of an operating malware detection system. While keeping the precision of the system higher than 94%, so that fewer than 6 out of 100 detections shown to the network administrator are false alarms, we were able to achieve an increase of approximately 7% in the number of detections. The algorithm effectively handles large amounts of data and can be used in conjunction with most of the state-of-the-art algorithms used to train decision forests.
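The description above names the ingredients (per-tree error estimates from out-of-bag data, class prevalence, the Bayes rule) but not the exact formula. The following is only a minimal sketch of one common way to combine them, a naive-Bayes vote combiner that treats trees as conditionally independent; it is an assumption for illustration, not the paper's algorithm. The function names `likelihoods` and `bayes_vote`, the Laplace `smoothing` parameter, and the example numbers are all hypothetical.

```python
import math


def likelihoods(confusion, smoothing=1.0):
    """Convert one tree's out-of-bag confusion matrix (rows: true class,
    columns: predicted class) into P(predicted | true).

    Laplace smoothing (an assumption of this sketch) avoids zero
    probabilities for rare classes with few out-of-bag samples.
    """
    n_classes = len(confusion)
    result = []
    for row in confusion:
        total = sum(row) + smoothing * n_classes
        result.append([(count + smoothing) / total for count in row])
    return result


def bayes_vote(tree_predictions, tree_confusions, priors):
    """Combine per-tree class votes into a posterior over classes.

    posterior(c) is proportional to
        prior(c) * product over trees of P(tree's vote | true class c),
    assuming the trees err independently given the true class.
    """
    log_post = [math.log(p) for p in priors]
    for pred, confusion in zip(tree_predictions, tree_confusions):
        lik = likelihoods(confusion)
        for c in range(len(priors)):
            log_post[c] += math.log(lik[c][pred])
    # Normalise in log space for numerical stability.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]


# Hypothetical example: class 0 is benign (99% of traffic), class 1 is malware.
oob_confusion = [[980, 20],  # true benign: 980 predicted benign, 20 malware
                 [2, 8]]     # true malware: 2 predicted benign, 8 malware
posterior = bayes_vote(
    tree_predictions=[1, 1, 1],
    tree_confusions=[oob_confusion] * 3,
    priors=[0.99, 0.01],
)
predicted_class = max(range(len(posterior)), key=posterior.__getitem__)
```

In this sketch the class priors enter the decision directly, so no separate class weights or score thresholds need tuning; whether the published method uses the same factorisation is not stated in this record.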
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
ISBN
978-1-5386-6650-0
ISSN
—
e-ISSN
2577-1655
Number of pages
6
Pages from-to
3325-3330
Publisher name
IEEE Computer Society
Place of publication
USA
Event location
Miyazaki
Event date
7 October 2018
Event type by nationality of participants
WRD - Worldwide event
UT WoS code of the article
—