Interpretability of Machine Learning-Based Results of Malware Detection Using a Set of Rules

Result identifiers

Result code in IS VaVaI
RIV/68407700:21240/22:00358963 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F22%3A00358963)
Result on the web
https://link.springer.com/chapter/10.1007/978-3-030-97087-1_5
DOI - Digital Object Identifier
10.1007/978-3-030-97087-1_5 (http://dx.doi.org/10.1007/978-3-030-97087-1_5)

Alternative languages

Result language
English
Title in original language
Interpretability of Machine Learning-Based Results of Malware Detection Using a Set of Rules
Result description in original language
Machine learning plays an indispensable role in modern malware detection; it provides malware researchers with quick and reliable results. However, it can be hard to understand why a model classified a given file as malicious or benign. This paper focuses on the interpretability of machine learning models’ results using decision lists generated by two rule-based classifiers, I-REP and RIPPER. We use the EMBER dataset, which contains features extracted through static analysis from Portable Executable files, to train various machine learning models. We extract decision lists from the machine learning models’ results using our implementation of I-REP and RIPPER. By taking into account the accuracies, true positive rates, and false positive rates of the decision lists, we reason about whether the generated decision lists represent the results well. To assess the interpretability of the machine learning models, we define the Human Most Understandable Model and Interpretability Entropy. These allow us to measure and compare interpretability among the models. The machine learning model most interpretable by RIPPER was Gaussian Naïve Bayes. Results show that RIPPER is relatively successful at interpreting other machine learning models; however, it needs some improvements to increase its true positive rate.
Title in English
Interpretability of Machine Learning-Based Results of Malware Detection Using a Set of Rules
Result description in English
Machine learning plays an indispensable role in modern malware detection; it provides malware researchers with quick and reliable results. However, it can be hard to understand why a model classified a given file as malicious or benign. This paper focuses on the interpretability of machine learning models’ results using decision lists generated by two rule-based classifiers, I-REP and RIPPER. We use the EMBER dataset, which contains features extracted through static analysis from Portable Executable files, to train various machine learning models. We extract decision lists from the machine learning models’ results using our implementation of I-REP and RIPPER. By taking into account the accuracies, true positive rates, and false positive rates of the decision lists, we reason about whether the generated decision lists represent the results well. To assess the interpretability of the machine learning models, we define the Human Most Understandable Model and Interpretability Entropy. These allow us to measure and compare interpretability among the models. The machine learning model most interpretable by RIPPER was Gaussian Naïve Bayes. Results show that RIPPER is relatively successful at interpreting other machine learning models; however, it needs some improvements to increase its true positive rate.
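The description mentions judging a decision list by its accuracy, true positive rate, and false positive rate. A minimal sketch of that comparison follows. It assumes that the reference labels are the predictions of the machine learning model being interpreted (1 = malicious, 0 = benign) and that the decision list's predictions are scored against them. The function name `fidelity_metrics` and the sample labels are hypothetical and do not come from the chapter.

```python
from typing import Dict, Sequence


def fidelity_metrics(model_labels: Sequence[int],
                     rule_labels: Sequence[int]) -> Dict[str, float]:
    """Score a decision list against the labels produced by an ML model.

    Assumption: 1 means malicious, 0 means benign, and the model's labels
    are treated as the reference that the decision list should reproduce.
    """
    if len(model_labels) != len(rule_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = tn = fn = 0
    for reference, predicted in zip(model_labels, rule_labels):
        if reference == 1 and predicted == 1:
            tp += 1  # rule list agrees on "malicious"
        elif reference == 0 and predicted == 1:
            fp += 1  # rule list flags a file the model called benign
        elif reference == 0 and predicted == 0:
            tn += 1  # rule list agrees on "benign"
        else:
            fn += 1  # rule list misses a file the model called malicious

    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical labels, for illustration only.
model_out = [1, 1, 1, 0, 0, 0, 0, 1]
rules_out = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = fidelity_metrics(model_out, rules_out)
```

A low true positive rate in this setting would mean the decision list fails to reproduce many of the model's "malicious" verdicts, which is the kind of shortfall the description reports for RIPPER.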

Classification

Type
C - Chapter in a scholarly book
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

Project
EF16_019/0000765: Výzkumné centrum informatiky
Linkages
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities

Other

Year of implementation
2022
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the book or collection
Cybersecurity for Artificial Intelligence
ISBN
978-3-030-97086-4
Number of pages of the result
30
Pages from-to
107-136
Number of pages of the book
380
Publisher name
Springer, Cham
Place of publication
—
UT WoS code of the chapter
—

Similar results (10)

Generation of Adversarial Malware and Benign Examples Using Reinforcement Learning
Application of Distance Metric Learning to Automated Malware Detection
Stealing and evading malware classifiers and antivirus at low false positive conditions
