Giampino, A., Ascari, R., Migliorati, S. (2024). A Flexible Generalization of the Latent Dirichlet Allocation. In Y. Dimotikalis, C.H. Skiadas (Eds.), Data Analysis and Related Applications 4: New Approaches: Volume 12 (pp. 109-122). Wiley [10.1002/9781394316915.ch8].

A Flexible Generalization of the Latent Dirichlet Allocation

Giampino A.; Ascari R.; Migliorati S.
2024

Abstract

Text modeling techniques have been used in a variety of applications, including the analysis of documents to detect latent topics. One of the most commonly used tools for topic modeling is latent Dirichlet allocation (LDA), which represents a document in terms of its latent topic structure. This chapter proposes a collapsed Gibbs sampling algorithm for estimating the parameters of the flexible LDA (FLDA) and for gaining insight into the distributions of topics over documents. It presents the distributions on the simplex that are considered, such as the Dirichlet and the flexible Dirichlet. The chapter introduces latent topic models with a focus on the LDA and the FLDA, and describes the main sampling schemes and estimation procedures used for both models. Finally, it assesses the performance of the new FLDA model through extensive simulation studies.
Book chapter or essay
collapsed Gibbs sampling; flexible Dirichlet; latent Dirichlet allocation; latent topic modeling; simulation studies;
English
Data Analysis and Related Applications 4: New Approaches: Volume 12
Dimotikalis, Y; Skiadas, CH
2024
9781786309921
Wiley
109
122
reserved
Files in this item:
File Size Format
Giampino-2024-Data Analysis and Related Applications 4: New Approaches: Volume 12-VoR.pdf

Archive managers only

Description: A Flexible Generalization of the Latent Dirichlet Allocation
Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 1.98 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/525821