An Empirical Analysis of Topic Models: Uncovering the Relationships between Hyperparameters, Document Length and Performance Measures

Terragni, S; Fersini, E
2021

Abstract

Neural Topic Models are recent neural models that aim to extract the main themes from a collection of documents. Comparisons of these models are usually limited because the hyperparameters are held fixed. In this paper, we present an empirical analysis and comparison of Neural Topic Models in which the optimal hyperparameters of each model are found for four different performance measures using single-objective Bayesian optimization. This allows us to determine how robust a topic model is across several evaluation metrics. We also empirically show the effect of document length on the different optimized metrics and identify which evaluation metrics conflict or agree with each other.
Type: paper
Keywords: Topic Models; Hyperparameters; Performance Measures
Language: English
Conference: International Conference on Recent Advances in Natural Language Processing: Deep Learning for Natural Language Processing Methods and Applications, RANLP 2021, 1-3 September 2021
Year: 2021
Editors: Angelova, G; Kunilovskaya, M; Mitkov, R; Nikolova-Koleva, I
Proceedings: International Conference Recent Advances in Natural Language Processing, RANLP
ISBN: 978-954452072-4
Pages: 1408-1416
Terragni, S., Fersini, E. (2021). An Empirical Analysis of Topic Models: Uncovering the Relationships between Hyperparameters, Document Length and Performance Measures. In International Conference Recent Advances in Natural Language Processing, RANLP (pp.1408-1416). Incoma Ltd [10.26615/978-954-452-072-4_157].

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/363090