This thesis is about the Forward Search, an approach to robust data analysis proposed and developed over the last 15 years, mainly by Atkinson, Riani and Cerioli. It is a general method for detecting unidentified subsets and masked outliers in complex data and for determining their effect on models fitted to the data. The thesis examines the Forward Search in the regression context from several perspectives, motivated by issues encountered in concrete applications to the analysis of international trade data. Its main contributions can be summarised as follows. First, it investigates how the Forward Search achieves its nominal size and how it deals with the multiple testing issue. Second, it relaxes the Forward Search algorithm so that outliers can be identified at arbitrary significance levels, rather than only at the standard 1% level inherent to the method. Third, it reports the results of a rigorous and extensive assessment of the actual size and power of the Forward Search in comparison with the current reference methods in robust regression, Least Median of Squares (LMS) and Least Trimmed Squares (LTS). The results show empirically that the Forward Search can achieve high power and small size at the same time. Fourth, it extends the Forward Search to identify and validate homogeneous sub-populations in the data, which in regression manifest as mixtures of linear components. Finally, it addresses the Forward Search from the perspective of exploratory data analysis, discussing new dynamic and interactive graphical tools for extracting information from the numerous plots that the Forward Search produces.
(2010). Advances in the forward search: methodological and applied contributions. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2010).
|Publication date:||25-Jan-2010|
|External supervisor:||RIANI, MARCO|
|Title:||Advances in the forward search: methodological and applied contributions|
|Scientific disciplinary sector:||SECS-S/01 - Statistics|
|Doctoral programme:||Statistics - 11R|
|Keywords:||robust statistics, Forward Search, outlier, regression, linear mixture, clustering, statistical visualisation, international trade, anti-fraud|
|Appears in types:||07 - Doctoral theses Bicocca post 2009|