Daily operation of a large-scale experiment is a resource-consuming task, particularly from the perspective of routine data quality monitoring. Typically, data come from different sub-detectors, and the global quality of the data depends on the combined performance of all of them. In this paper, we consider the problem of identifying the channels in which anomalies occurred. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify the 'channels' affected by an anomaly. Such a model could be used to cross-check and assist data quality managers and to identify good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel; only a global flag is used. This distinguishes the model from classical classification methods. Applied to CMS data collected in 2010, the approach demonstrates its ability to decompose anomalies by channel.
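The abstract does not describe the architecture, so the following is only a minimal sketch of how per-channel anomaly scores could be supervised by a single global flag. It assumes a noisy-OR aggregation (a sample is anomalous if at least one channel is anomalous) and a binary cross-entropy loss; both choices, and the names `global_anomaly_probability` and `global_flag_loss`, are illustrative assumptions and not necessarily the authors' method.

```python
import math


def global_anomaly_probability(channel_scores):
    """Noisy-OR aggregation (an assumption of this sketch).

    channel_scores: per-channel probabilities of being anomalous, each in [0, 1].
    The sample is treated as good only if every channel is good.
    """
    p_all_good = 1.0
    for p in channel_scores:
        p_all_good *= 1.0 - p
    return 1.0 - p_all_good


def global_flag_loss(channel_scores, global_flag, eps=1e-12):
    """Binary cross-entropy between the aggregated probability and the
    global flag (1 = anomalous sample, 0 = good sample).

    No per-channel labels appear anywhere in the loss.
    """
    p = global_anomaly_probability(channel_scores)
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(global_flag * math.log(p) + (1 - global_flag) * math.log(1.0 - p))
```

Under this assumed loss, a sample flagged as good penalises every channel score that is high, while a sample flagged as anomalous only requires that at least one channel score be high. This is one way channel-level information can emerge from a global label alone.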

Azzolini, V., Borisyak, M., Cerminara, G., Derkach, D., Franzoni, G., De Guio, F., et al. (2018). Deep learning for inferring cause of data anomalies. In 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research, ACAT 2017. IOP Publishing Ltd [10.1088/1742-6596/1085/4/042015].

Deep learning for inferring cause of data anomalies

De Guio F.;
2018

Type: paper
Keywords: Classification methods; Cross check; Data anomalies; Data quality; Global quality; Ground truth; Large scale experiments; Learning models
Language: English
Conference: 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research, ACAT 2017, 21-25 August 2017
Publication year: 2018
Volume: 1085
Issue: 4
Article number: 042015

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/477684