Improving Classification of Malware Families using Learning a Distance Metric

The result's identifiers

Result code in IS VaVaI
RIV/68407700:21240/21:00347213 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F21%3A00347213)
Result on the web
https://www.insticc.org/node/TechnicalProgram/icissp/2021/presentationDetails/103263
DOI - Digital Object Identifier
10.5220/0010326306430652 (http://dx.doi.org/10.5220/0010326306430652)

Alternative languages

Result language
English
Original language name
Improving Classification of Malware Families using Learning a Distance Metric
Original language description
The objective of malware family classification is to assign a tested sample to the correct malware family. This paper concerns the application of selected state-of-the-art distance metric learning techniques to malware family classification. The goal of distance metric learning algorithms is to find the most appropriate distance metric parameters with respect to some optimization criteria. The distance metric learning algorithms considered in our research learn from metadata, mostly contained in the headers of executable files in the PE file format. Several experiments were conducted on a dataset of 14,000 samples consisting of six prevalent malware families and benign files. The experimental results showed that the average precision and recall of the k-Nearest Neighbors algorithm improved significantly when it used a distance learned on the training data rather than a non-learned distance. The k-Nearest Neighbors classifier using the Mahalanobis distance metric learned by the Metric Learning for Kernel Regression method achieved an average precision and an average recall of 97.04% each, compared to Random Forest, the best performer among the state-of-the-art ML algorithms considered in our experiments, with 96.44% average precision and 96.41% average recall.
Czech name
—
Czech description
—
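
As an illustration of the idea in the description above, the following is a minimal sketch of a k-Nearest Neighbors classifier that measures distance through a linear map L, which is equivalent to a Mahalanobis distance with M = L^T L. It is not the authors' implementation: the toy features, the family labels, and the matrix `hand_picked` are assumptions made up for this example, and `hand_picked` merely stands in for a matrix that a method such as Metric Learning for Kernel Regression would learn from training data.

```python
from collections import Counter
from math import sqrt


def mahalanobis(x, y, L):
    """Distance between x and y after applying the linear map L.

    Equivalent to sqrt((x - y)^T M (x - y)) with M = L^T L.
    """
    diff = [a - b for a, b in zip(x, y)]
    projected = [sum(row[j] * diff[j] for j in range(len(diff))) for row in L]
    return sqrt(sum(v * v for v in projected))


def knn_predict(query, train_X, train_y, L, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: mahalanobis(query, pair[0], L),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): the first feature separates the two families,
# the second feature is large-scale noise unrelated to the label.
train_X = [
    [1.0, -100.0], [1.2, -80.0], [0.9, 20.0],
    [3.0, 90.0], [3.1, 70.0], [2.9, 10.0],
]
train_y = ["family_a"] * 3 + ["family_b"] * 3

identity = [[1.0, 0.0], [0.0, 1.0]]     # plain Euclidean distance
hand_picked = [[1.0, 0.0], [0.0, 0.0]]  # assumed stand-in for a learned map

query = [1.1, 85.0]
print(knn_predict(query, train_X, train_y, identity))
print(knn_predict(query, train_X, train_y, hand_picked))
```

With the non-learned (Euclidean) distance the noisy second feature dominates and the query is labeled `family_b`; with the map that suppresses that feature it is labeled `family_a`. In this toy setting, that is the kind of effect a learned metric is meant to produce.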

Classification

Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
EF16_019/0000765: Research Center for Informatics
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific university research

Others

Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

Article name in the collection
Proceedings of the 7th International Conference on Information Systems Security and Privacy
ISBN
978-989-758-491-6
ISSN
2184-4356
e-ISSN
—
Number of pages
10
Pages from-to
643-652
Publisher name
SciTePress
Place of publication
Madeira
Event location
Vienna / Virtual
Event date
Feb 11, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000664076200068

Similar results(10)

Application of Distance Metric Learning to Automated Malware Detection
Improving Classification of Malware Families using a Learned Distance Metric for Low Dimensions
Distance Metric Learning using Particle Swarm Optimization to Improve Static Malware Detection
