Candelieri, A., Giordani, I., Perego, R., & Archetti, F. (2020). Composition of kernel and acquisition functions for high dimensional Bayesian optimization. In Learning and Intelligent Optimization: 14th International Conference, LION 14, Athens, Greece, May 24–28, 2020, Revised Selected Papers (pp. 316–323). Springer. https://doi.org/10.1007/978-3-030-53552-0_29
Composition of kernel and acquisition functions for high dimensional Bayesian optimization
Candelieri, A.; Giordani, I.; Perego, R.; Archetti, F.
2020
Abstract
Bayesian Optimization has become the reference method for the global optimization of black-box, expensive, and possibly noisy functions. Bayesian Optimization learns a probabilistic model of the objective function, usually a Gaussian Process, and builds, based on its mean and variance, an acquisition function whose optimizer yields the next evaluation point; the probabilistic surrogate model is then updated with the new observation. Despite its sample efficiency, Bayesian Optimization does not scale well with the dimensionality of the problem. Moreover, the optimization of the acquisition function has received less attention because its computational cost is usually considered negligible compared to that of evaluating the objective function; its efficient optimization is also inhibited, particularly in high dimensional problems, by multiple extrema and "flat" regions. In this paper we leverage the additivity (also known as separability) of the objective function to map both the kernel and the acquisition function of Bayesian Optimization onto lower dimensional subspaces. This approach makes both the learning/updating of the probabilistic surrogate model and the optimization of the acquisition function more efficient. Experimental results are presented for a standard test function and a real-life application.
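For concreteness, the following is a minimal, self-contained sketch (plain NumPy, not the authors' code) of the general idea the abstract describes, in the spirit of additive GP-UCB methods: a GP whose kernel is a sum of kernels acting on disjoint groups of coordinates, per-component posteriors derived from that additive structure, and an acquisition function (UCB here, as one common choice) maximized independently in each low dimensional subspace. The grouping, the test function (a negated Styblinski-Tang, used only for illustration), the fixed kernel hyperparameters, and all names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def rbf(A, B, lengthscale=2.0):
    """Squared-exponential kernel on a low dimensional subspace
    (fixed lengthscale for brevity; real code would fit it)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

class AdditiveGP:
    """GP with additive kernel k(x, x') = sum_i k_i(x^(i), x'^(i)),
    where each k_i acts only on the coordinate group groups[i]."""
    def __init__(self, groups, noise=1e-4):
        self.groups, self.noise = groups, noise

    def fit(self, X, y):
        self.X = X
        K = sum(rbf(X[:, g], X[:, g]) for g in self.groups)
        self.K_inv = np.linalg.inv(K + self.noise * np.eye(len(X)))
        self.alpha = self.K_inv @ y          # K^{-1} y, reused for all means

    def component_posterior(self, i, Z):
        """Posterior mean and std of the i-th additive component at the
        points Z (Z holds only the columns of group i)."""
        Ks = rbf(Z, self.X[:, self.groups[i]])
        mu = Ks @ self.alpha
        var = 1.0 - np.einsum('ij,jk,ik->i', Ks, self.K_inv, Ks)  # k_i(z,z)=1
        return mu, np.sqrt(np.maximum(var, 1e-12))

def propose_next(gp, bounds, beta=2.0, n_cand=2000, rng=None):
    """Maximize an additive UCB acquisition one subspace at a time and
    concatenate the per-group maximizers into one full candidate point."""
    rng = np.random.default_rng(0) if rng is None else rng
    x_next = np.empty(bounds.shape[0])
    for i, g in enumerate(gp.groups):
        Z = rng.uniform(bounds[g, 0], bounds[g, 1], size=(n_cand, len(g)))
        mu, sd = gp.component_posterior(i, Z)
        x_next[g] = Z[np.argmax(mu + beta * sd)]  # low dimensional arg-max
    return x_next

# Usage on a separable test function: 4-d Styblinski-Tang, negated so we
# maximize; the split into two 2-d groups is an assumed decomposition.
def f(x):
    return -0.5 * np.sum(x ** 4 - 16 * x ** 2 + 5 * x)

bounds = np.tile([-5.0, 5.0], (4, 1))
rng = np.random.default_rng(42)
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(10, 4))
y = np.array([f(x) for x in X])
gp = AdditiveGP(groups=[[0, 1], [2, 3]])
for _ in range(20):                               # BO loop
    gp.fit(X, (y - y.mean()) / y.std())
    x_new = propose_next(gp, bounds, rng=rng)
    X = np.vstack([X, x_new])
    y = np.append(y, f(x_new))
print("best value found:", y.max())
```

The point of the decomposition is visible in `propose_next`: instead of searching one 4-d acquisition surface, each iteration searches two 2-d surfaces, which suffer far less from the flat regions and multiple extrema the abstract mentions; the same additive structure also keeps the kernel evaluations low dimensional when the surrogate is updated.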