Peptide sequences of different lengths, available from synthesised peptide libraries and sequenced proteins, are potentially valuable for evaluating structure-activity relationships. However, in order to apply multivariate regression and classification models to such sequences, a preprocessing method is needed that translates them into a uniform set of variables. A molecular descriptor based approach can be suitable for the characterisation of peptide sequences and the prediction of their chemical or biological properties. In this paper, a novel methodology based on traditional molecular descriptors calculated on a simplified representation of peptides and proteins is evaluated. This representation avoids problems related to molecular size and to the information redundancy caused by the structural features common to every amino acid. The proposed methodology has been successfully applied to a peptide data set taken from the literature.
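To illustrate what translating variable-length sequences into a uniform set of variables means in practice, the minimal sketch below maps each peptide to a fixed-length vector of amino acid fractions. This is an illustrative assumption only: amino acid composition is a generic, well-known encoding and is not the descriptor set or the simplified representation proposed in the paper, and the names `AMINO_ACIDS` and `composition_descriptors` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes) in a fixed order.
# Assumption: sequences contain only these symbols.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_descriptors(sequence):
    """Return a 20-element list of residue fractions for one peptide.

    Hypothetical helper for illustration; not the method of the paper.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    counts = Counter(seq)
    # Dividing by the length makes the vector independent of peptide size.
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


# Peptides of different lengths all map to rows with the same 20 columns,
# so they can be stacked into one data matrix for regression or classification.
peptides = ["GAG", "ACDKLW", "VVVVPPPP"]
X = [composition_descriptors(p) for p in peptides]
```

Every row of `X` has the same number of columns whatever the peptide length, which is the property a multivariate model needs. Composition discards residue order, so descriptors that capture sequence or structural information, such as those evaluated in the paper, would normally be required as well.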
Mauri, A., Ballabio, D., Consonni, V., Manganaro, A., Todeschini, R. (2008). Peptides multivariate characterisation using a molecular descriptor based approach. MATCH, 60, 671-690.
Peptides multivariate characterisation using a molecular descriptor based approach
MAURI, ANDREA;BALLABIO, DAVIDE;CONSONNI, VIVIANA;TODESCHINI, ROBERTO
2008