Bicocca Open Archive

Applicability Domains for Classification Problems: Benchmarking of Distance to Models for Ames Mutagenicity Set

Sushko, I; Novotarskyi, S; Körner, R; Pandey, AK; Cherkasov, A; Li, J; Gramatica, P; Hansen, K; Schroeter, T; Müller, KR; Xi, L; Liu, H; Yao, X; Öberg, T; Hormozdiari, F; Dao, P; Sahinalp, C; Todeschini, R; Polishchuk, P; Artemenko, A; Kuz'min, V; Martin, TM; Young, DM; Fourches, D; Tropsha, A; Baskin, I; Horvath, D; Marcou, G; Varnek, A; Prokopenko, VV; Tetko, IV

2010

Abstract

The estimation of accuracy and applicability of QSAR and QSPR models for biological and physicochemical properties represents a critical problem. The “distance to model” (DM) is defined as a metric of similarity between the training set compounds and the test set compounds subjected to QSAR/QSPR modeling. In our previous work, we demonstrated the utility and optimal performance of DM metrics based on the standard deviation within an ensemble of QSAR models. The current study applies this analysis to 30 QSAR models for the Ames mutagenicity data set that were previously reported within the 2009 QSAR challenge. We demonstrate that DMs based on an ensemble (consensus) model provide systematically better performance than other DMs. The presented approach identifies 30-60% of compounds whose prediction accuracy is similar to the interlaboratory accuracy of the Ames test, which is estimated to be 90%. Thus, in silico predictions can be used to halve the cost of experimental measurements while providing a similar prediction accuracy. The developed model has been made publicly available at http://ochem.eu/models/1
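The idea of an ensemble-based DM can be illustrated with a minimal sketch. It assumes that the DM of a compound is the standard deviation of the scores given by the ensemble members, and that compounds are ranked by increasing DM to find the fraction that reaches a target accuracy. The function names, the 0.5 decision threshold, and the toy data are illustrative assumptions; this record does not give the exact DM definitions used in the paper.

```python
from statistics import mean, pstdev


def ensemble_dm(member_scores):
    """Return (consensus score, DM) for one compound.

    Assumption: the DM is the population standard deviation of the
    scores produced by the individual ensemble members.
    """
    return mean(member_scores), pstdev(member_scores)


def coverage_at_accuracy(ensemble_scores, labels, target=0.90, threshold=0.5):
    """Largest fraction of compounds, taken in order of increasing DM,
    for which the consensus prediction accuracy is at least `target`."""
    ranked = []
    for scores, label in zip(ensemble_scores, labels):
        consensus, dm = ensemble_dm(scores)
        predicted = 1 if consensus >= threshold else 0
        ranked.append((dm, predicted == label))

    # Low DM first: these are the predictions the ensemble agrees on most.
    ranked.sort(key=lambda item: item[0])

    best_fraction = 0.0
    correct = 0
    for n, (_, is_correct) in enumerate(ranked, start=1):
        correct += is_correct
        if correct / n >= target:
            best_fraction = n / len(ranked)
    return best_fraction


# Hypothetical toy data: 5 compounds, 3 ensemble members each.
toy_scores = [
    [0.90, 0.95, 0.85],
    [0.10, 0.05, 0.15],
    [0.80, 0.70, 0.90],
    [0.60, 0.30, 0.70],
    [0.20, 0.60, 0.40],
]
toy_labels = [1, 0, 1, 0, 0]  # 1 = mutagenic, 0 = non-mutagenic

if __name__ == "__main__":
    print(coverage_at_accuracy(toy_scores, toy_labels, target=0.90))
```

In this toy set the only misclassified compound is also the one on which the ensemble members disagree most, so it is ranked last and excluded from the high-confidence subset.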

Subtype: Journal article - Scientific article
Keywords: QSAR; mutagenicity; Ames test
Content language: English
Publication date: 2010
Journal: Journal of Chemical Information and Modeling
Volume: 50
Issue: 12
First page: 2094
Last page: 2111
Article DOI: https://dx.doi.org/10.1021/ci100253r
Full text: none
Citation: Sushko, I., Novotarskyi, S., Körner, R., Pandey, A., Cherkasov, A., Li, J., et al. (2010). Applicability Domains for Classification Problems: Benchmarking of Distance to Models for Ames Mutagenicity Set. Journal of Chemical Information and Modeling, 50(12), 2094-2111 [10.1021/ci100253r].
Appears in categories: 01 - Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/44669
