Loda: Lightweight on-line detector of anomalies

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F16%3A00237783" target="_blank" >RIV/68407700:21230/16:00237783 - isvavai.cz</a>
Výsledek na webu
<a href="http://dx.doi.org/10.1007/s10994-015-5521-0" target="_blank" >http://dx.doi.org/10.1007/s10994-015-5521-0</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10994-015-5521-0" target="_blank" >10.1007/s10994-015-5521-0</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Loda: Lightweight on-line detector of anomalies
Popis výsledku v původním jazyce
In supervised learning it has been shown that a collection of weak classifiers can result in a strong classifier with error rates similar to those of more sophisticated methods. In unsupervised learning, namely in anomaly detection such a paradigm has not yet been demonstrated despite the fact that many methods have been devised as counterparts to supervised binary classifiers. This work partially fills the gap by showing that an ensemble of very weak detectors can lead to a strong anomaly detector with a performance equal to or better than state of the art methods. The simplicity of the proposed ensemble system (to be called Loda) is particularly useful in domains where a large number of samples need to be processed in real-time or in domains where the data stream is subject to concept drift and the detector needs to be updated on-line. Besides being fast and accurate, Loda is also able to operate and update itself on data with missing variables. Loda is thus practical in domains with sensor outages. Moreover, Loda can identify features in which the scrutinized sample deviates from the majority. This capability is useful when the goal is to find out what has caused the anomaly. It should be noted that none of these favorable properties increase Loda’s low time and space complexity. We compare Loda to several state of the art anomaly detectors in two settings: batch training and on-line training on data streams. The results on 36 datasets from UCI repository illustrate the strengths of the proposed system, but also provide more insight into the more general questions regarding batch-vs-on-line anomaly detection.
Název v anglickém jazyce
Loda: Lightweight on-line detector of anomalies
Popis výsledku anglicky
In supervised learning it has been shown that a collection of weak classifiers can result in a strong classifier with error rates similar to those of more sophisticated methods. In unsupervised learning, namely in anomaly detection such a paradigm has not yet been demonstrated despite the fact that many methods have been devised as counterparts to supervised binary classifiers. This work partially fills the gap by showing that an ensemble of very weak detectors can lead to a strong anomaly detector with a performance equal to or better than state of the art methods. The simplicity of the proposed ensemble system (to be called Loda) is particularly useful in domains where a large number of samples need to be processed in real-time or in domains where the data stream is subject to concept drift and the detector needs to be updated on-line. Besides being fast and accurate, Loda is also able to operate and update itself on data with missing variables. Loda is thus practical in domains with sensor outages. Moreover, Loda can identify features in which the scrutinized sample deviates from the majority. This capability is useful when the goal is to find out what has caused the anomaly. It should be noted that none of these favorable properties increase Loda’s low time and space complexity. We compare Loda to several state of the art anomaly detectors in two settings: batch training and on-line training on data streams. The results on 36 datasets from UCI repository illustrate the strengths of the proposed system, but also provide more insight into the more general questions regarding batch-vs-on-line anomaly detection.

Klasifikace

Druh
J<sub>x</sub> - Nezařazeno - Článek v odborném periodiku (Jimp, Jsc a Jost)
CEP obor
JD - Využití počítačů, robotika a její aplikace
OECD FORD obor
—

Návaznosti výsledku

Projekt
<a href="/cs/project/GPP103%2F12%2FP514" target="_blank" >GPP103/12/P514: Detekce anomalií v reálném čase a časově nestálem prostředí</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Ostatní

Rok uplatnění
2016
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název periodika
Machine Learning
ISSN
0885-6125
e-ISSN
—
Svazek periodika
102
Číslo periodika v rámci svazku
2
Stát vydavatele periodika
US - Spojené státy americké
Počet stran výsledku
30
Strana od-do
275-304
Kód UT WoS článku
000371460000005
EID výsledku v databázi Scopus
2-s2.0-84955680786

Podobné výsledky(10)

Anomaly detection by bagging Domain Adaptation for Sequential Detection -- {PhD} Thesis Proposal Detektor anomálií v dlouhodobě sledovaných datech

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Loda: Lightweight on-line detector of anomalies

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)