Todeschini, R., Ballabio, D., Consonni, V., Mauri, A., Pavan, M. (2007). CAIMAN (Classification And Influence Matrix Analysis): a new approach to the classification based on leverage-scaled functions. Chemometrics and Intelligent Laboratory Systems, 87(1), 3-17 [10.1016/j.chemolab.2005.11.001].
CAIMAN (Classification And Influence Matrix Analysis): a new approach to the classification based on leverage-scaled functions
TODESCHINI, ROBERTO; BALLABIO, DAVIDE; CONSONNI, VIVIANA; MAURI, ANDREA
2007
Abstract
A new classification method based on the influence matrix (or leverage matrix) is proposed. The influence matrix is well known in regression analysis, where its diagonal elements (i.e. the leverages) give information on the role of each sample within the regression model. In effect, the leverages are related to the distance of the sample from the hyperellipsoids defining the model space, to the degree of membership in the model, and to the reliability of model predictions. Exploiting these leverage properties, the Classification And Influence Matrix Analysis method (CAIMAN) models each class by means of the class dispersion matrix and calculates the leverage of each sample with respect to each class model space. Unlike other classification methods such as LDA, QDA, and UNEQ, CAIMAN does not require multinormality assumptions. It is developed in three options: (1) D-CAIMAN, a discriminant classification method; (2) M-CAIMAN, a class modelling method that allows an object to be classified, not classified at all, or assigned to more than one class; and (3) A-CAIMAN, which deals with the asymmetric case, i.e. only a reference class needs to be modelled. The performance of the proposed method was evaluated on several classification data sets taken from the literature and compared with the most popular classification methods. The final results seem to indicate that CAIMAN performs well and, in most of the analysed cases, better than the other classification methods. © 2005 Elsevier B.V. All rights reserved.
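As a rough illustration of the idea described in the abstract, the sketch below computes a leverage-type quantity for each sample with respect to each class, using the class centroid and the class dispersion matrix. It is not the published CAIMAN algorithm: the abstract does not give the leverage-scaled functions, so the centring, the unnormalised scatter matrix, the pseudo-inverse, and the "smallest leverage wins" rule are all assumptions made here for illustration, and the function names `class_leverages` and `assign_min_leverage` are hypothetical.

```python
import numpy as np


def class_leverages(X_train, y_train, X_new):
    """Leverage of each row of X_new with respect to each class model space.

    Assumption (not from the paper): the leverage is taken as
    h = d^T (Zc^T Zc)^+ d, where d is the sample centred on the class
    mean and Zc is the centred training data of that class.
    """
    classes = np.unique(y_train)
    H = np.empty((X_new.shape[0], classes.size))
    for k, c in enumerate(classes):
        Xc = X_train[y_train == c]
        mean_c = Xc.mean(axis=0)
        Zc = Xc - mean_c
        dispersion = Zc.T @ Zc              # class dispersion (scatter) matrix
        inv_disp = np.linalg.pinv(dispersion)  # pseudo-inverse for robustness
        D = X_new - mean_c
        # Row-wise quadratic form d^T inv_disp d
        H[:, k] = np.einsum("ij,jk,ik->i", D, inv_disp, D)
    return classes, H


def assign_min_leverage(classes, H):
    """Stand-in discriminant rule: pick the class with the smallest leverage."""
    return classes[np.argmin(H, axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(10.0, 1.0, (20, 2))])
    y = np.array([0] * 20 + [1] * 20)
    classes, H = class_leverages(X, y, np.array([[0.1, -0.2], [9.8, 10.3]]))
    print(H)
    print(assign_min_leverage(classes, H))
```

A low leverage with respect to a class means the sample lies close to that class's model space; a class-modelling variant in the spirit of M-CAIMAN would compare each leverage against a per-class threshold instead of taking the minimum, which is what would allow a sample to be assigned to no class or to several.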