Deep learning for inferring cause of data anomalies
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F18%3A00106847" target="_blank" >RIV/00216224:14330/18:00106847 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1088/1742-6596/1085/4/042015" target="_blank" >http://dx.doi.org/10.1088/1742-6596/1085/4/042015</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1088/1742-6596/1085/4/042015" target="_blank" >10.1088/1742-6596/1085/4/042015</a>
Alternative languages
Result language
English
Title in the original language
Deep learning for inferring cause of data anomalies
Result description in the original language
Daily operation of a large-scale experiment is a resource-consuming task, particularly with respect to routine data quality monitoring. Typically, data come from different sub-detectors, and the global quality of the data depends on the combined performance of all of them. In this paper, we consider the problem of identifying the channels in which anomalies occurred. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify the 'channels' affected by an anomaly. Such a model could be used to cross-check and assist data quality managers and to identify good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel; only a global flag is used. This effectively distinguishes the model from classical classification methods. Applied to CMS data collected in 2010, the approach demonstrates its ability to decompose an anomaly into separate channels.
Title in English
Deep learning for inferring cause of data anomalies
Result description in English
Daily operation of a large-scale experiment is a resource-consuming task, particularly with respect to routine data quality monitoring. Typically, data come from different sub-detectors, and the global quality of the data depends on the combined performance of all of them. In this paper, we consider the problem of identifying the channels in which anomalies occurred. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify the 'channels' affected by an anomaly. Such a model could be used to cross-check and assist data quality managers and to identify good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel; only a global flag is used. This effectively distinguishes the model from classical classification methods. Applied to CMS data collected in 2010, the approach demonstrates its ability to decompose an anomaly into separate channels.
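The idea of training on a global flag only can be illustrated with a minimal sketch. It is not the model from the paper: it assumes, purely for illustration, that a sample is good only when every channel is good and that channels are independent, so per-channel "good" probabilities multiply. The function names (`global_good_probability`, `global_flag_loss`, `suspect_channels`) and the 0.5 threshold are hypothetical.

```python
import math


def global_good_probability(channel_probs):
    """Combine per-channel 'good' probabilities into one global probability.

    Assumption (not taken from the paper): the sample is good only if all
    channels are good, and channels are independent, so probabilities multiply.
    """
    result = 1.0
    for p in channel_probs:
        result *= p
    return result


def global_flag_loss(channel_probs, global_flag, eps=1e-12):
    """Binary cross-entropy against the global flag (1 = good, 0 = anomalous).

    Only the global flag enters the loss; no per-channel labels are needed.
    """
    p = global_good_probability(channel_probs)
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(global_flag * math.log(p) + (1 - global_flag) * math.log(1.0 - p))


def suspect_channels(channel_probs, threshold=0.5):
    """Return indices of channels whose 'good' probability is below threshold."""
    return [i for i, p in enumerate(channel_probs) if p < threshold]
```

Under these assumptions, an anomalous sample incurs a lower loss when at least one channel receives a low "good" probability, which is what pushes the per-channel outputs to localize the anomaly even though only the global flag is supervised.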
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific university research
Others
Publication year
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
Journal of Physics: Conference Series Volume 1085, Issue 4, 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research, ACAT 2017
ISBN
—
ISSN
1742-6588
e-ISSN
—
Number of pages
6
Pages from-to
1-6
Publisher name
Institute of Physics Publishing
Place of publication
Seattle
Event location
Seattle
Event date
1. 1. 2018
Event type by nationality
CST - National event
UT WoS code of the article
000467872200081