Multiple instance learning for malware classification
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F18%3A00315376" target="_blank" >RIV/68407700:21230/18:00315376 - isvavai.cz</a>
Výsledek na webu
<a href="https://www.sciencedirect.com/science/article/pii/S0957417417307170?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0957417417307170?via%3Dihub</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.eswa.2017.10.036" target="_blank" >10.1016/j.eswa.2017.10.036</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Multiple instance learning for malware classification
Popis výsledku v původním jazyce
This work addresses classification of unknown binaries executed in sandbox by modeling their interaction with system resources (files, mutexes, registry keys and communication with servers over the network) and error messages provided by the operating system, using vocabulary-based method from the multiple instance learning paradigm. It introduces similarities suitable for individual resource types that combined with an approximative clustering method efficiently group the system resources and define features directly from data. This approach effectively removes randomization often employed by malware authors and projects samples into low-dimensional feature space suitable for common classifiers. An extensive comparison to the state of the art on a large corpus of binaries demonstrates that the proposed solution achieves superior results using only a fraction of training samples. Moreover, it makes use of a source of information different than most of the prior art, which increases the diversity of tools detecting the malware, hence making detection evasion more difficult.
Název v anglickém jazyce
Multiple instance learning for malware classification
Popis výsledku anglicky
This work addresses classification of unknown binaries executed in sandbox by modeling their interaction with system resources (files, mutexes, registry keys and communication with servers over the network) and error messages provided by the operating system, using vocabulary-based method from the multiple instance learning paradigm. It introduces similarities suitable for individual resource types that combined with an approximative clustering method efficiently group the system resources and define features directly from data. This approach effectively removes randomization often employed by malware authors and projects samples into low-dimensional feature space suitable for common classifiers. An extensive comparison to the state of the art on a large corpus of binaries demonstrates that the proposed solution achieves superior results using only a fraction of training samples. Moreover, it makes use of a source of information different than most of the prior art, which increases the diversity of tools detecting the malware, hence making detection evasion more difficult.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
—
Návaznosti
I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace
Ostatní
Rok uplatnění
2018
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
Expert Systems with Applications
ISSN
0957-4174
e-ISSN
1873-6793
Svazek periodika
2018
Číslo periodika v rámci svazku
93
Stát vydavatele periodika
NL - Nizozemsko
Počet stran výsledku
12
Strana od-do
346-357
Kód UT WoS článku
000416498300028
EID výsledku v databázi Scopus
2-s2.0-85032009982