Over the past few decades, a broad variety of models has been developed for graphs. However, modern applications in various fields have highlighted the need to account for higher-order interactions, that is, to include information deriving from groups of three or more nodes. Simple examples include group interactions in social networks, scientific co-authorship, interactions between more than two species in ecological models, and high-order correlations between neurons in brain networks. Hypergraphs provide the most general formalization of higher-order interactions: like a graph, a hypergraph is defined by a set of nodes and a set of hyperedges, where each hyperedge specifies the nodes taking part in one interaction. We propose a stochastic block model for hypergraphs to perform model-based clustering that captures the information deriving from higher-order interactions. A discrete latent variable with Q support points is associated with each node, identifying the latent states in the population. The model parameters are the weight of each latent state and the probability that a hyperedge occurs given the latent states of its nodes. The formulation is flexible enough to accommodate simplified latent structures; for example, the conditional probability of occurrence of a hyperedge may take only two values: one if all its nodes belong to the same latent state, and another otherwise. Maximum likelihood estimation of the model parameters is performed through a variational expectation-maximization algorithm, which maximizes a lower bound of the log-likelihood function. Spectral clustering techniques are employed to provide a suitable initialization for the algorithm, and model selection is explored using the integrated classification likelihood (ICL) criterion.
The model is applied to both simulated and real data, and its performance is assessed in terms of parameter estimation and the ability to recover the clusters (measured through the Adjusted Rand Index). The estimation algorithm is implemented in C++ (in both serial and parallel versions) and is made available for R.
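
As an illustration of the simplified two-value structure described above, the following is a minimal Python sketch that simulates a hypergraph from such a model. It is not the authors' C++/R implementation. The function name `simulate_hypergraph`, the parameter names `alpha` (probability when all nodes share a latent state) and `beta` (probability otherwise), the cap `max_size` on hyperedge size, and the choice to use the same two probabilities for every hyperedge size are all assumptions made for this example.

```python
import itertools
import random


def simulate_hypergraph(n, weights, alpha, beta, max_size=3, seed=0):
    """Simulate latent states and hyperedges under a two-value block model.

    All names here are hypothetical, chosen for illustration only.
    n        : number of nodes
    weights  : weights of the Q latent states (should sum to 1)
    alpha    : occurrence probability when all nodes share one latent state
    beta     : occurrence probability otherwise
    max_size : largest hyperedge size considered (assumption of this sketch)
    """
    rng = random.Random(seed)
    q = len(weights)
    # Draw one latent state per node according to the state weights.
    z = rng.choices(range(q), weights=weights, k=n)
    hyperedges = []
    # Enumerate every candidate node subset of size 2..max_size.
    for m in range(2, max_size + 1):
        for nodes in itertools.combinations(range(n), m):
            same_state = len({z[i] for i in nodes}) == 1
            p = alpha if same_state else beta
            # Each candidate hyperedge occurs independently given the states.
            if rng.random() < p:
                hyperedges.append(nodes)
    return z, hyperedges


if __name__ == "__main__":
    z, edges = simulate_hypergraph(10, [0.5, 0.5], alpha=0.6, beta=0.05)
    print("latent states:", z)
    print("number of hyperedges:", len(edges))
```

Enumerating all subsets is only feasible for small `n` and `max_size`; the sketch is meant to make the generative assumptions concrete, not to scale.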

Matias, C., Brusa, L. (2022). Model-based clustering in hypergraphs through a stochastic blockmodel. In International Conference APPLIED STATISTICS 2022 - Abstracts and Program (pp.29-29).

Model-based clustering in hypergraphs through a stochastic blockmodel

Brusa, L
2022

Type: abstract
Keywords: Networks; Latent variable model; Variational EM algorithm
Language: English
Conference: International Conference APPLIED STATISTICS 2022
Conference year: 2022
Editors: Kastrin Andrej; Lusa Lara
Published in: International Conference APPLIED STATISTICS 2022 - Abstracts and Program
ISBN: 978-961-94283-2-0
Year: 2022
Pages: 29-29
URL: https://akastrin.si/as/as-book-2022.pdf
Access: open
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/399859