Determining the redox potentials of protein cofactors, and how they are influenced by their molecular neighborhoods, is essential for basic research and many biotechnological applications, from biosensors and biocatalysis to bioremediation and bioelectronics. Because measuring redox potentials with current experimental techniques is laborious, there is a need for computational approaches that can predict them reliably. Although current computational approaches based on quantum and molecular mechanics are accurate, their high computational cost hinders their use. In this work, we explored the possibility of using more efficient quantitative structure-property relationship (QSPR) models based on machine learning (ML) to predict protein redox potentials, as an alternative to classical approaches. As a proof of concept, we focused on flavoproteins, one of the most important families of enzymes directly involved in redox processes. To train and test different ML models, we retrieved a dataset of flavoproteins with a known midpoint redox potential (Em) and 3D structure. The features of interest, accounting for both short- and long-range effects of the protein matrix on the flavin cofactor, were automatically extracted from each protein PDB file. Our best ML model (XGB) has a prediction error below 1 kcal/mol (∼36 mV), comparing favorably with more sophisticated computational approaches. We also indicate which features most affect the Em value and, where possible, rationalize them on the basis of previous studies.
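
The abstract quotes the model error in both energy and potential units. The short sketch below shows how the two are related through ΔG = nFE. The number of electrons transferred is an assumption here (the abstract does not state it), and the function names are hypothetical. With n = 1, 36 mV corresponds to roughly 0.83 kcal/mol, which is consistent with the stated "below 1 kcal/mol".

```python
# Illustrative unit conversion between a potential error (mV) and a
# free-energy error (kcal/mol) using |dG| = n * F * E.
# Assumption: n_electrons defaults to 1; the source does not specify n.

FARADAY_C_PER_MOL = 96485.33212  # Faraday constant, C/mol
JOULES_PER_KCAL = 4184.0         # thermochemical kilocalorie, J/kcal


def mv_to_kcal_per_mol(potential_mv: float, n_electrons: int = 1) -> float:
    """Return the magnitude of n*F*E in kcal/mol for a potential in mV."""
    volts = potential_mv / 1000.0
    return n_electrons * FARADAY_C_PER_MOL * volts / JOULES_PER_KCAL


def kcal_per_mol_to_mv(energy_kcal: float, n_electrons: int = 1) -> float:
    """Return the potential in mV equivalent to an energy in kcal/mol."""
    volts = energy_kcal * JOULES_PER_KCAL / (n_electrons * FARADAY_C_PER_MOL)
    return volts * 1000.0


if __name__ == "__main__":
    print(round(mv_to_kcal_per_mol(36.0), 2))  # 0.83
    print(round(kcal_per_mol_to_mv(1.0), 1))   # 43.4
```

For a two-electron process the same potential error would correspond to twice the energy, so the value chosen for `n_electrons` matters when comparing errors across studies.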

Galuzzi, B., Mirarchi, A., Viganò, E., De Gioia, L., Damiani, C., Arrigoni, F. (2022). Machine Learning for Efficient Prediction of Protein Redox Potential: The Flavoproteins Case. Journal of Chemical Information and Modeling, 62(19), 4748-4759 [10.1021/acs.jcim.2c00858].

Machine Learning for Efficient Prediction of Protein Redox Potential: The Flavoproteins Case

Galuzzi, Bruno Giovanni (co-first author); Mirarchi, Antonio (co-first author); De Gioia, Luca; Damiani, Chiara; Arrigoni, Federica
2022

Type: Journal article - Scientific article
Keywords: Machine Learning; Oxidation-Reduction; Flavins; Flavoproteins
Language: English
Date: 20 Sep 2022
Year: 2022
Volume: 62
Issue: 19
First page: 4748
Last page: 4759
Access: open
Files in this record:
File: 10281-396336_VoR.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 6.75 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/396336
Citations
  • Scopus: 11
  • Web of Science (ISI): 11