The aim of this thesis is to study two main categories of algorithms for object detection and their use in particular applications. The first category investigated concerns keypoint-based approaches. Several comparative experiments are performed within the standard testing pipeline of the MPEG CDVS Test Model, and an extended pipeline that makes use of color information is proposed. The second category of object detectors investigated is based on Convolutional Neural Networks. Two applications of Convolutional Neural Networks to object recognition are addressed in particular. The first concerns logo recognition. Two classification pipelines are designed and tested on a real-world dataset of images collected from Flickr. The first architecture uses a pre-trained network as a feature extractor and achieves results comparable to those of keypoint-based approaches. The second architecture uses a tiny Neural Network trained end-to-end that outperforms state-of-the-art keypoint-based methods. The other application addressed is painting categorization. It consists of identifying the author of a painting, assigning the painting to the school or art movement it belongs to, and categorizing its genre, e.g. landscape, portrait, illustration, etc. To tackle this problem, a novel multibranch and multitask Neural Network structure is proposed that benefits from the joint use of keypoint-based approaches and neural features. In both applications, the use of data augmentation techniques to enlarge the training set is also investigated. In particular, for paintings, a neural style transfer algorithm is exploited to generate synthetic paintings to be used in training.

The aim of this thesis is to study two main categories of algorithms for object detection and their use in particular applications. The first category examined concerns keypoint-based approaches. Several comparative experiments are carried out within the standard pipeline of the MPEG CDVS Test Model, and an extended pipeline that makes use of color information is proposed. The second category of object detectors under investigation is based on convolutional neural networks. In particular, two applications of convolutional neural networks to object recognition are addressed. The first concerns the recognition of commercial logos. Two classification pipelines are designed and tested on a set of images collected from Flickr. The first architecture uses a pre-trained network as a feature extractor and achieves results comparable to those of keypoint-based algorithms. The second architecture relies on a neural network that outperforms state-of-the-art keypoint-based methods. The other application examined is painting categorization, which consists of identifying the author, assigning a painting to the school or art movement it belongs to, and classifying the genre of the painting, e.g. landscape, portrait, illustration, etc. To address this problem, a multibranch and multitask neural network structure is proposed that benefits from the joint use of keypoint-based approaches and neural features. In both applications, the use of data augmentation techniques to enlarge the training set is also examined. In particular, for paintings, a neural-network-based painting style transfer algorithm is used to generate synthetic paintings for use during training.
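
The multitask idea mentioned in the abstract (one network predicting author, school or art movement, and genre together) can be illustrated with a minimal sketch. This is not the architecture proposed in the thesis: the feature values, the head names `artist`, `style` and `genre`, the class counts, and the choice of summing per-task cross-entropy losses are all assumptions made for illustration. The sketch concatenates a hypothetical CNN feature vector with a hypothetical keypoint-based descriptor and feeds the result to three linear softmax heads.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, bias):
    """Dense layer: one output per (weight row, bias) pair."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def multitask_loss(features, heads, labels):
    """Sum of per-task cross-entropy losses over heads sharing one input."""
    loss = 0.0
    for task, (weights, bias) in heads.items():
        probs = softmax(linear(features, weights, bias))
        loss += -math.log(probs[labels[task]])
    return loss


# Hypothetical inputs: both vectors are made-up toy values.
cnn_features = [0.5, -1.0, 2.0]
keypoint_features = [0.1, 0.3]
features = cnn_features + keypoint_features  # joint representation

n = len(features)
# Hypothetical heads with all-zero weights and illustrative class counts.
heads = {
    "artist": ([[0.0] * n for _ in range(4)], [0.0] * 4),
    "style": ([[0.0] * n for _ in range(2)], [0.0] * 2),
    "genre": ([[0.0] * n for _ in range(3)], [0.0] * 3),
}
labels = {"artist": 1, "style": 0, "genre": 2}

loss = multitask_loss(features, heads, labels)
```

With all-zero weights every head predicts a uniform distribution, so the joint loss reduces to the sum of the logarithms of the class counts. In a trained model the weights would be learned by minimizing this joint loss, possibly with per-task weighting.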

Mazzini, D. (2018). Local Detectors and Descriptors for Object and Scene Recognition. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2018).

Local Detectors and Descriptors for Object and Scene Recognition

MAZZINI, DAVIDE
2018

SCHETTINI, RAIMONDO
MAURINO, ANDREA
BIANCO, SIMONE
Keypoint-based; Detectors; Descriptors; Deep; Networks
INF/01 - INFORMATICA
English
8-mar-2018
INFORMATICA - 87R
30
2016/2017
open
Files in this item:
File: phd_unimib_709737.pdf
Access: open access
Description: doctoral thesis
Attachment type: Doctoral thesis
Size: 6.02 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/199003