This paper presents Prevent, a fully unsupervised approach to predict and localize failures in distributed enterprise applications. Software failures in production are unavoidable. Predicting failures and locating failing components online are the first steps to proactively manage faults in production. Many techniques predict failures from anomalous combinations of system metrics with supervised, weakly supervised, and semi-supervised learning models. Supervised approaches require large sets of labelled data not commonly available in large enterprise applications, and address failure types that can be either captured with predefined rules or observed while training supervised models. Prevent integrates the core ingredients of unsupervised approaches into a novel fully unsupervised approach to predict failures and localize failing resources. The results of experimenting with Prevent on a commercially-compliant distributed cloud system indicate that Prevent provides more stable, reliable and timely predictions than supervised learning approaches, without requiring the often impractical training with labeled data.

Denaro, G., Heydarov, R., Mohebbi, A., Pezzè, M. (2023). Prevent: An Unsupervised Approach to Predict Software Failures in Production. IEEE Transactions on Software Engineering, 49(12), 5139-5153 [10.1109/TSE.2023.3327583].

Prevent: An Unsupervised Approach to Predict Software Failures in Production

Denaro G.;Pezzè M.
2023

Type: Journal article - Scientific article
Keywords: distributed applications; failure prediction; machine learning
Language: English
Publication date: 2 November 2023
Volume: 49
Issue: 12
Pages: 5139-5153
Article number: 3327583
Access: open
Files in this item:
10281-462318_VoR.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 835.97 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/462318
Citations
  • Scopus 3
  • ISI (Web of Science) 1