Bicocca Open Archive

Sabbatella, A., Ponti, A., Candelieri, A., Archetti, F. (2024). Bayesian Optimization Using Simulation-Based Multiple Information Sources over Combinatorial Structures. MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 6(4), 2232-2247 [10.3390/make6040110].

Bayesian Optimization Using Simulation-Based Multiple Information Sources over Combinatorial Structures

Sabbatella A.; Ponti A.; Candelieri A.; Archetti F.

2024

Abstract

Owing to its flexibility and sample efficiency, Bayesian optimization has become a standard approach for simulation optimization. To reduce the cost of evaluating the objective function, one can resort to cheaper surrogates of it. Examples are ubiquitous, from protein engineering and materials science to tuning machine learning algorithms, where one could use a subset of the full training set or even a smaller related dataset. The use of cheap information sources within the optimization scheme has been studied in the literature as the multi-fidelity optimization problem. Cheaper sources may hold some promise toward tractability, but they provide an incomplete model, inducing unknown bias and epistemic uncertainty. In this manuscript, we are concerned with the discrete case, where (Formula presented.) is the value of the performance measure associated with the environmental condition (Formula presented.) and (Formula presented.) represents the relevance of the condition (Formula presented.) (i.e., the probability of occurrence or the fraction of time this condition occurs). The main contribution of this paper is a Gaussian-based framework, called the augmented Gaussian process (AGP), which is based on sparsification and was originally proposed for continuous functions, together with its generalization to stochastic optimization under different risk profiles for combinatorial optimization. The AGP leverages sample- and cost-efficient Bayesian optimization (BO) over multiple information sources and supports a new acquisition function that selects the next source–location pair by considering the cost of the source and the (location-dependent) model discrepancy. An extensive set of computational results supports risk-aware optimization based on the conditional value-at-risk (CVaR). Computational experiments on benchmark functions and real-world problems confirm the performance of the MISO-AGP method and the hyperparameter optimization.
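The discrete setting described in the abstract, with a performance value per environmental condition and a relevance (probability) per condition, lends itself to a small worked example of CVaR. The sketch below is not the authors' implementation: the formulas are elided in this record, so the function name `discrete_cvar`, the argument names, and the convention that larger values are worse (losses) are all assumptions made for illustration. It uses the standard tail-average definition of CVaR for a discrete distribution, splitting the probability mass of the outcome that straddles the tail boundary.

```python
import numpy as np


def discrete_cvar(values, probs, alpha):
    """CVaR at level alpha for a discrete loss distribution.

    Hypothetical helper (not from the paper). Assumes `values[i]` is the
    loss under condition i (larger = worse) and `probs[i]` its probability.
    Returns the average loss over the worst (1 - alpha) probability mass.
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    if values.ndim != 1 or values.shape != probs.shape:
        raise ValueError("values and probs must be 1-D arrays of equal length")
    if np.any(probs < 0) or not np.isclose(probs.sum(), 1.0):
        raise ValueError("probs must be non-negative and sum to 1")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")

    # Sort outcomes from worst to best.
    order = np.argsort(values)[::-1]
    v, p = values[order], probs[order]

    tail = 1.0 - alpha
    # Probability mass already consumed before each outcome is reached.
    mass_before = np.concatenate(([0.0], np.cumsum(p)[:-1]))
    # Take mass from each outcome until the tail budget is exhausted;
    # the outcome at the boundary contributes only part of its mass.
    taken = np.clip(tail - mass_before, 0.0, p)
    return float(np.dot(taken, v) / tail)


# Four equally likely conditions with losses 1..4.
losses = [1.0, 2.0, 3.0, 4.0]
weights = [0.25, 0.25, 0.25, 0.25]
risk_neutral = discrete_cvar(losses, weights, alpha=0.0)   # plain expectation
risk_averse = discrete_cvar(losses, weights, alpha=0.5)    # mean of worst half
```

With `alpha=0.0` the function reduces to the expected value, and raising `alpha` toward 1 focuses on progressively worse conditions, which is one way to express the different risk profiles the abstract refers to.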

Subtype	Journal article - Scientific article
Keywords	Bayesian optimization; combinatorial optimization; information sources; multi-fidelity; network design; simulation; value-at-risk
Content language	English
Ahead-of-print date or first online publication date	5 Oct 2024
Publication date	2024
Journal	MACHINE LEARNING AND KNOWLEDGE EXTRACTION
Volume	6
Issue	4
First page	2232
Last page	2247
Article DOI	https://dx.doi.org/10.3390/make6040110
Full text	open
Citation	Sabbatella, A., Ponti, A., Candelieri, A., Archetti, F. (2024). Bayesian Optimization Using Simulation-Based Multiple Information Sources over Combinatorial Structures. MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 6(4), 2232-2247 [10.3390/make6040110].
Appears in types	01 - Journal article

Files in this item:

File	Size	Format
Sabbatella-2024-Machine Learning and Knowledge Extraction-VoR.pdf (open access; attachment type: Publisher’s Version (Version of Record, VoR); license: Creative Commons. Description: This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).)	6.34 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/551725
