Torti, F. (2010). Advances in the forward search: methodological and applied contributions. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2010).

Advances in the forward search: methodological and applied contributions

TORTI, FRANCESCA
2010

Abstract

This thesis is about the Forward Search, an approach to robust data analysis proposed and developed over the last 15 years, mainly by Atkinson, Riani and Cerioli. It is a general method for detecting unidentified subsets and masked outliers in complex data and for determining their effect on models fitted to the data. The thesis examines the Forward Search in the regression context from several perspectives, motivated by issues encountered in concrete applications related to the analysis of international trade data. The main contributions can be summarised as follows. First, the thesis investigates how the Forward Search achieves its nominal size and how it deals with the multiple testing issue. Second, the Forward Search algorithm is relaxed so that outliers can be identified at arbitrary significance levels, rather than only at the standard 1% level that is inherent to the method. Third, the thesis reports the results of a rigorous and extensive assessment of the actual size and power of the Forward Search in comparison with the current reference methods in robust regression, least median of squares (LMS) and least trimmed squares (LTS). The results show empirically that the Forward Search can achieve high power and small size at the same time. Fourth, the Forward Search is extended to identify and validate homogeneous sub-populations in the data which, in regression, appear as mixtures of linear components. Finally, the Forward Search is considered from the perspective of exploratory data analysis, with a discussion of new dynamic and interactive graphical tools for extracting information from the numerous plots that the Forward Search produces.
RIANI, MARCO
robust statistics, Forward Search, outlier, regression, linear mixture, clustering, statistical visualisation, international trade, anti-fraud
SECS-S/01 - STATISTICS
English
25-Jan-2010
STATISTICA - 11R
22
2008/2009
open
Files in this item:
phd_unimib_707887.pdf
Access: open access
Attachment type: Doctoral thesis
Size: 9.5 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/7791