Silva, S., Vanneschi, L. (2009). Using operator equalisation for prediction of drug toxicity with genetic programming. In Progress in Artificial Intelligence: 14th Portuguese Conference on Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 12-15, 2009, Proceedings (pp. 65-76). Springer [10.1007/978-3-642-04686-5_6].

Using operator equalisation for prediction of drug toxicity with genetic programming

VANNESCHI, LEONARDO
2009

Abstract

Predicting the toxicity of new potential drugs is a fundamental step in the drug design process. Recent contributions have shown that, although Genetic Programming is a promising method for this task, predicting the toxicity of molecular compounds is a complex problem that is difficult to solve. In particular, when applied to drug toxicity prediction, Genetic Programming suffers from the well-known phenomenon of bloat, i.e. growth in code size during the evolutionary process without a corresponding improvement in fitness. We hypothesize that bloat might cause overfitting and thus prevent the method from discovering simpler and potentially more general solutions. For this reason, in this paper we investigate two recently defined variants of the operator equalization bloat control method for Genetic Programming. We show that these two methods remain bloat-free even on this complex problem. Nevertheless, overfitting remains an issue. Thus, contradicting the widespread idea that bloat and overfitting are strongly related, we argue that the two phenomena are independent of each other and that eliminating bloat does not necessarily eliminate overfitting. © 2009 Springer Berlin Heidelberg.
Type: paper
Keywords: operator, equalisation, prediction, drug, toxicity, genetic, programming
Language: English
Conference: 14th Portuguese Conference on Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 12-15
Conference year: 2009
Editors: Lopes, LS; Lau, N; Mariano, P; Rocha, LM
Book title: Progress in Artificial Intelligence 14th Portuguese Conference on Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 12-15, 2009. Proceedings.
ISBN: 978-3-642-04685-8
Publication year: 2009
Volume: 5816 LNAI
First page: 65
Last page: 76
none

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/16075