Consonni, V., Todeschini, R. (2012). Multivariate Analysis of Molecular Descriptors. In M. Dehmer, K. Varmuza, D. Bonchev (Eds.), Statistical Modelling of Molecular Descriptors in QSAR/QSPR (pp. 111-147). Weinheim: Wiley-Blackwell [10.1002/9783527645121.ch4].
Multivariate Analysis of Molecular Descriptors
Consonni, Viviana; Todeschini, Roberto
2012
Abstract
To date, several thousand molecular descriptors have been proposed in the scientific literature, and most of them are easily calculated from molecular structures with the aid of dedicated computational tools. Describing molecules by a large number of numerical indices is therefore now an easy task, but effort must then be devoted to handling such a large amount of chemical information. Even the most common multivariate statistical analysis techniques sometimes fail when dealing with very large data sets composed of highly correlated variables. A future challenge is therefore to overcome the limitations of existing techniques by implementing novel multivariate approaches able to analyze data structures, extract useful information, and establish robust predictive models. The objective of this chapter is to investigate the chemical information encompassed by molecular descriptors derived from graph-theoretical matrices and to elucidate their role in quantitative structure-activity relationship (QSAR) and drug design. The chapter first reviews the different types of 2D matrix-based descriptors proposed in the literature to date. Then, some methodological topics related to multivariate data analysis are surveyed, with particular attention to the analysis of similarity/diversity of chemical spaces. The last part of the chapter deals with the application of 2D matrix-based descriptors to the study of similarity relationships in QSAR data sets. © 2012 Wiley-VCH Verlag GmbH & Co. KGaA. All rights reserved.