K-means clustering of honeynet data with unsupervised representation learning
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F25840886%3A_____%2F21%3AN0000036" target="_blank" >RIV/25840886:_____/21:N0000036 - isvavai.cz</a>
Result on the web
<a href="http://ceur-ws.org/Vol-2853/paper48.pdf" target="_blank" >http://ceur-ws.org/Vol-2853/paper48.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
angličtina
Original language name
K-means clustering of honeynet data with unsupervised representation learning
Original language description
Networks connected to the Internet are vulnerable to malicious activity that threaten the stability of work. The types and characteristics of malicious actions are constantly changing, which significantly complicates the fight against them. Attacks on computer networks are subject to constant updates and modifications. Modern intrusion detection systems should ensure the detection of both existing types of attacks and new types of attacks about which there might be no information available at the time of attack. Honeypots and honeynets play an important role in monitoring malicious activities and detecting new types of attacks. The use of honeypots and honeynets has significant advantages: they can protect working services, provide network vulnerability detection, reduce the false positive rate, slow down the influence of malicious actions on the working network, and collect data on malicious activity. The analysis of the data collected by a honeynet helps detect attack patterns that can be used in intrusion detection systems. This paper uses clustering to determine attack patterns based on the time series of attacker activity. Using time series instead of static data facilitates the detection of attacks at their onset. This paper proposes the joint application of k-means clustering and a recurrent autoencoder for time series preprocessing. The weights of the recurrent autoencoder are optimized on the basis of the total loss function, which contains two components: a recovery loss component and a clustering loss component. The recurrent encoder, consisting of convolutional and recurrent blocks, provides an effective time series representation, suitable for finding similar patterns using k-means clustering. Experimental research shows that the proposed approach clusters malicious actions monitored by a honeynet and identifies patterns of attacks.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10200 - Computer and information sciences
Result continuities
Project
—
Continuities
N - Vyzkumna aktivita podporovana z neverejnych zdroju
Others
Publication year
2021
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Article name in the collection
CEUR Workshop Proceedings
ISBN
—
ISSN
1613-0073
e-ISSN
—
Number of pages
11
Pages from-to
439 - 449
Publisher name
CEUR-WS
Place of publication
CEUR-WS
Event location
Khmelnytskyi
Event date
Mar 24, 2021
Type of event by nationality
EUR - Evropská akce
UT code for WoS article
—