Terragni, S., Candelieri, A., Fersini, E. (2023). The role of hyper-parameters in relational topic models: Prediction capabilities vs topic quality. INFORMATION SCIENCES, 632(June 2023), 252-268 [10.1016/j.ins.2023.02.076].

The role of hyper-parameters in relational topic models: Prediction capabilities vs topic quality

Terragni S.; Candelieri A.; Fersini E.
2023

Abstract

In this paper, we investigate the impact of optimal hyper-parameter configuration in relational topic models. The main goal is to validate the hypothesis that single-objective Bayesian Optimization (BO) can discover a hyper-parameter setting that leads a set of relational topic models to simultaneously achieve good prediction capabilities and qualitatively meaningful topics. A comparative analysis performed on 7 state-of-the-art models, 5 performance measures and 3 datasets highlights three main findings: (1) the majority of relational topic models are not able to offer a good trade-off between classification capabilities and topic interpretability; (2) single-objective optimization of hyper-parameters, aimed at maximizing the F1-Measure, is able to create topics that are also optimal with respect to the Kullback-Leibler divergence measure; (3) the Pareto frontiers across several performance metrics reveal that the most promising trade-off between the performance metrics can be obtained by Constrained Relational Topic Models.
Type: Journal article - Scientific article
Keywords: Evaluation; Hyper-parameter tuning; Topic modelling
Language: English
Date: 2 March 2023
Year: 2023
Volume: 632
Issue: June 2023
Pages: 252-268

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/408458