The current understanding of deep neural networks can only partially explain how input structure, network parameters and optimization algorithms jointly contribute to achieve the strong generalization power that is typically observed in many real-world applications. In order to improve the comprehension and interpretability of deep neural networks, we here introduce a novel theoretical framework based on the compositional structure of piecewise linear activation functions. By defining a direct acyclic graph representing the composition of activation patterns through the network layers, it is possible to characterize the in-stances of the input data with respect to both the predicted label and the specific (linear) transformation used to perform predictions. Preliminary tests on the MNIST dataset show that our method can group input instances with regard to their similarity in the internal representation of the neural network, providing an intuitive measure of input complexity.
Craigher, F., Angaroni, F., Graudenzi, A., Stella, F., Antoniotti, M. (2020). Investigating the Compositional Structure Of Deep Neural Networks. In The Sixth International Conference on Machine Learning, Optimization, and Data Science (pp.322-334) [10.1007/978-3-030-64583-0_30].
Investigating the Compositional Structure Of Deep Neural Networks
Craigher, FPrimo
;Angaroni, F;Graudenzi, A
;Stella, F;Antoniotti, MUltimo
2020
Abstract
The current understanding of deep neural networks can only partially explain how input structure, network parameters and optimization algorithms jointly contribute to achieve the strong generalization power that is typically observed in many real-world applications. In order to improve the comprehension and interpretability of deep neural networks, we here introduce a novel theoretical framework based on the compositional structure of piecewise linear activation functions. By defining a direct acyclic graph representing the composition of activation patterns through the network layers, it is possible to characterize the in-stances of the input data with respect to both the predicted label and the specific (linear) transformation used to perform predictions. Preliminary tests on the MNIST dataset show that our method can group input instances with regard to their similarity in the internal representation of the neural network, providing an intuitive measure of input complexity.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.