Optimized Invariant Representation of Network Traffic for Detecting Unseen Malware Variants
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F16%3A00307384" target="_blank" >RIV/68407700:21230/16:00307384 - isvavai.cz</a>
Result on the web
<a href="https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_bartos.pdf" target="_blank" >https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_bartos.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Optimized Invariant Representation of Network Traffic for Detecting Unseen Malware Variants
Original language description
New and unseen polymorphic malware, zero-day attacks, or other types of advanced persistent threats are usually not detected by signature-based security devices, firewalls, or anti-viruses. This represents a challenge to the network security industry as the amount and variability of incidents have been increasing. Consequently, this complicates the design of learning-based detection systems relying on features extracted from network data. The problem is caused by different joint distributions of observations (features) and labels in the training and testing data sets. This paper proposes a classification system designed to detect both known and previously unseen security threats. The classifiers use a statistical feature representation computed from the network traffic and learn to recognize malicious behavior. The representation is designed and optimized to be invariant to the most common changes of malware behaviors. This is achieved in part by a feature histogram constructed for each group of HTTP flows (proxy log records) of a user visiting a particular hostname and in part by a feature self-similarity matrix computed for each group. The parameters of the representation (histogram bins) are optimized and learned based on the training samples along with the classifiers. The proposed classification system was deployed on large corporate networks, where it detected 2,090 new and unseen variants of malware samples with 90% precision (9 of 10 alerts were malicious), which is a considerable improvement when compared to the current flow-based approaches or existing signature-based web security devices.
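The grouping, self-similarity matrix, and histogram mentioned in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single numeric feature per flow, the absolute-difference similarity, and the fixed bin edges are all assumptions made for illustration (the description states that the bins are learned from training data, which this sketch does not attempt).

```python
from collections import defaultdict


def group_flows(flows):
    """Group per-flow feature values by (user, hostname).

    `flows` is an iterable of (user, hostname, value) tuples; this record
    layout is a hypothetical stand-in for proxy log records.
    """
    groups = defaultdict(list)
    for user, hostname, value in flows:
        groups[(user, hostname)].append(value)
    return dict(groups)


def self_similarity(values):
    """Matrix of pairwise absolute differences between flows of one group."""
    return [[abs(a - b) for b in values] for a in values]


def histogram(matrix, edges):
    """Normalised counts of matrix entries over bins [edges[i], edges[i+1]).

    The last bin also includes its right edge. Entries outside the edges are
    ignored. The edges are fixed here; learning them is out of scope.
    """
    counts = [0] * (len(edges) - 1)
    for row in matrix:
        for x in row:
            for i in range(len(counts)):
                is_last = i == len(counts) - 1
                if edges[i] <= x < edges[i + 1] or (is_last and x == edges[-1]):
                    counts[i] += 1
                    break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def represent(values, edges):
    """Histogram of the self-similarity matrix of one group of flows."""
    return histogram(self_similarity(values), edges)


# Hypothetical bin edges and two groups whose values differ by a constant shift.
EDGES = [0, 25, 60, 1000]
original = represent([100, 120, 150], EDGES)
shifted = represent([1100, 1120, 1150], EDGES)
print(original == shifted)
```

Because the matrix holds only differences between flows of the same group, adding a constant to every value leaves the resulting histogram unchanged, which is one simple form of the invariance the description refers to.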
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2016
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 25th USENIX Security Symposium
ISBN
978-1-931971-32-4
ISSN
—
e-ISSN
—
Number of pages
16
Pages from-to
807-822
Publisher name
The USENIX Association
Place of publication
—
Event location
Austin, Texas
Event date
Aug 10, 2016
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000385263000048