Gaussian Process based Bayesian Optimization is largely adopted for solving problems where the inputs are in Euclidean spaces. In this paper we associate the inputs to discrete probability distributions which are elements of the probability simplex. To search in the new design space, we need a distance between distributions. The optimal transport distance (aka Wasserstein distance) is chosen due to its mathematical structure and the computational strategies enabled by it. Both the GP and the acquisition function is generalized to an acquisition functional over the probability simplex. To optimize this functional two methods are proposed, one based on auto differentiation and the other based on proximal-point algorithm and the gradient flow. Finally, we report a preliminary set of computational results on a class of problems whose dimension ranges from 5 to 100. These results show that embedding the Bayesian optimization process in the probability simplex enables an effective algorithm whose performance over standard Bayesian optimization improves with the increase of problem dimensionality.
Candelieri, A., Ponti, A., Archetti, F. (2025). Bayesian optimization over the probability simplex. ANNALS OF MATHEMATICS AND OF ARTIFICIAL INTELLIGENCE, 93(1), 77-91 [10.1007/s10472-023-09883-w].
Bayesian optimization over the probability simplex
Candelieri A.;Ponti A.
;Archetti F.
2025
Abstract
Gaussian Process based Bayesian Optimization is largely adopted for solving problems where the inputs are in Euclidean spaces. In this paper we associate the inputs to discrete probability distributions which are elements of the probability simplex. To search in the new design space, we need a distance between distributions. The optimal transport distance (aka Wasserstein distance) is chosen due to its mathematical structure and the computational strategies enabled by it. Both the GP and the acquisition function is generalized to an acquisition functional over the probability simplex. To optimize this functional two methods are proposed, one based on auto differentiation and the other based on proximal-point algorithm and the gradient flow. Finally, we report a preliminary set of computational results on a class of problems whose dimension ranges from 5 to 100. These results show that embedding the Bayesian optimization process in the probability simplex enables an effective algorithm whose performance over standard Bayesian optimization improves with the increase of problem dimensionality.| File | Dimensione | Formato | |
|---|---|---|---|
|
unpaywall-bitstream-1269884629.pdf
accesso aperto
Tipologia di allegato:
Publisher’s Version (Version of Record, VoR)
Licenza:
Creative Commons
Dimensione
1.71 MB
Formato
Adobe PDF
|
1.71 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


