Visual Anomaly Detection For Automatic Quality Control

Piccoli, F

Automatic quality control is one of the key ingredients of the fourth industrial revolution, which will lead to the development of the so-called Industry 4.0. In this context, a crucial element is the detection of defects, anomalies, or product failures within a time compatible with production. This thesis focuses on exactly this theme: anomaly detection for industrial quality inspection, carried out through the analysis of images depicting the product under inspection. The analysis relies on machine learning, and in particular on convolutional neural networks (CNNs), a powerful tool for image analysis. The thesis starts with an extensive study of the subject to introduce the reader, and proposes a pipeline for automatic anomaly detection. The pipeline is composed of two steps: 1) enhancement of the input images to highlight defects; 2) detection of the anomalies. The first step is addressed with a global color transformation able to remove undesired lighting effects and enhance contrast. This transformation is inferred by SpliNet, a new CNN-based method presented here, which enhances the input images by inferring the parameters of a set of splines. For the second step, anomaly detection, two methods are presented. The first aims to model normality by learning a dictionary and using it at test time to determine the degree of abnormality of a query image. This method is based on deep learning, which is known to be data-hungry. However, the proposed algorithm also works with very small training sets (on the order of five images). The presented method improves performance by 5% over the state of the art on the SEM-acquired nanofibers dataset, achieving an area under the curve of 97.4%. The second proposed algorithm is a generative method able to restore the input, creating an anomaly-free version of the query image.
This method uses a set of local transforms to restore the input images. Specifically, these transforms are sets of degree-two polynomials whose parameters are determined by a convolutional neural network. The method can be tuned through a parameter toward accuracy or speed, to match the needs of the end user. To address the lack of data that affects this field, a new deep-learning-based method for data augmentation is presented. This method is able to generate thousands of new synthesized samples starting from only a few real ones, and is therefore particularly suitable for augmenting long-tail datasets. The quality of the synthesized samples is demonstrated by the increase in performance of machine learning algorithms trained on the augmented dataset. The method has been employed to enlarge a dataset of defective asphalt images. In this context, the use of the augmented dataset increased the average performance on anomaly segmentation by up to 17.5 percentage points. For classes with low cardinality, the improvement is up to 54.5 percentage points. For all the methods presented here, I demonstrate their effectiveness by comparing the results against the respective state of the art, showing that they outperform existing methods.
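The restoration step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each pixel and channel is restored independently as `a*x^2 + b*x + c`, and that an anomaly map is obtained as the absolute difference between the input and its restored version; the abstract does not specify either detail. The function names `apply_local_quadratic` and `anomaly_map` are hypothetical, and the coefficients, which in the thesis are predicted by a CNN, are set by hand here.

```python
import numpy as np

def apply_local_quadratic(image, coeffs):
    """Apply out = a*x^2 + b*x + c independently at every pixel and channel.

    image:  float array of shape (H, W, C) with values in [0, 1]
    coeffs: float array of shape (H, W, C, 3) holding (a, b, c) per pixel
            and channel. Assumption: supplied by hand here; in the thesis
            the parameters are determined by a CNN.
    """
    a, b, c = coeffs[..., 0], coeffs[..., 1], coeffs[..., 2]
    restored = a * image**2 + b * image + c
    return np.clip(restored, 0.0, 1.0)

def anomaly_map(image, restored):
    """Per-pixel score: mean absolute difference over channels (an assumption)."""
    return np.abs(image - restored).mean(axis=-1)

# Toy input: uniform gray image with one bright "defect" pixel.
image = np.full((4, 4, 3), 0.5)
image[1, 2] = 1.0

# Identity transform everywhere (a=0, b=1, c=0) ...
coeffs = np.zeros((4, 4, 3, 3))
coeffs[..., 1] = 1.0
# ... except at the defect, where a hand-picked transform maps 1.0 back to 0.5.
coeffs[1, 2] = [0.0, 0.5, 0.0]

restored = apply_local_quadratic(image, coeffs)
scores = anomaly_map(image, restored)
```

In this toy case the restored image is uniformly gray and the only non-zero score sits at the defect pixel. A coarser coefficient grid (one set of coefficients per block of pixels rather than per pixel) would be one conceivable way to trade accuracy for speed, but how the thesis implements its tuning parameter is not stated in the abstract.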

Automatic quality control in production processes is one of the key elements of the fourth industrial revolution, which will lead to the creation of the so-called Industry 4.0. In this context, a fundamental element is the detection of defects, anomalies, or product failures within a time compatible with production. This thesis focuses on exactly this theme: anomaly detection for automatic quality control, through the analysis of images depicting the product under inspection. The analysis is carried out with machine learning techniques, in particular convolutional neural networks (CNNs), which are a very powerful tool for image analysis. The thesis first carries out an extensive study of the subject to introduce the reader, and proposes a sequence of processing steps for automatic anomaly detection. The steps are: 1) enhancement of the images to highlight defects; 2) detection of the anomalies. The first step is addressed with a global color transformation able to remove undesired lighting effects and increase contrast. This transformation is obtained with SpliNet, a CNN-based method presented in this thesis, which enhances the input images by inferring the parameters of a set of splines. The second step, anomaly detection, is addressed by proposing two different methods. The first aims to model normality by learning a dictionary and using it at test time to determine the degree of abnormality of an unseen image. This method is based on CNNs, which are known to require large amounts of training data. However, the proposed algorithm is able to work with a very small training set (on the order of five images).
The presented method improves performance by 5% over the state of the art on the dataset of nanofibers acquired with a scanning electron microscope (SEM), reaching an area under the curve of 97.4%. The second proposed method uses a set of local transformations to restore the input images. Specifically, these transformations are a set of degree-two polynomials whose parameters are determined by a convolutional neural network. The method is designed so that a single parameter can be used to balance accuracy against computation time, in order to meet the needs of the end user. To address the lack of data that affects the field of automatic anomaly detection, an innovative data augmentation method based on deep learning is presented. This method is able to generate thousands of new synthetic samples starting from a few real ones, and is therefore particularly suitable for augmenting long-tail datasets. The quality of the synthesized samples is demonstrated by measuring the performance of machine learning algorithms trained on the augmented dataset. The method has been used to expand a dataset of images depicting defective asphalt. In this context, the use of the augmented dataset increased the average performance on anomaly segmentation by up to 17.5 percentage points. For classes with low cardinality, the increase reaches up to 54.5 percentage points. All the presented methods outperform the state of the art.
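For readers unfamiliar with the area-under-the-curve figure quoted above, the following minimal sketch shows how a ROC AUC can be computed from anomaly scores. It uses the standard pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the toy scores are illustrative assumptions, and the abstract does not state whether the reported value is computed per image or per pixel.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC AUC as the probability that a randomly chosen anomalous sample
    receives a higher score than a randomly chosen normal one (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]       # scores of anomalous samples
    neg = scores[~labels]      # scores of normal samples
    diff = pos[:, None] - neg[None, :]   # all anomalous/normal pairs
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)

# Toy data (hypothetical): higher score means "more anomalous".
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]          # 1 marks an anomalous sample
auc = roc_auc(scores, labels)  # 3 of the 4 pairs are ranked correctly: 0.75
```

An AUC of 1.0 means every anomalous sample is scored above every normal one, while 0.5 corresponds to random ranking. The pairwise form is quadratic in the number of samples, so it is meant for illustration rather than for large pixel-level evaluations.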

Piccoli, F. (2019). Visual Anomaly Detection For Automatic Quality Control. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2019).