Sokli, E., Kasela, P., Peikos, G., Pasi, G. (2025). Investigating Mixture of Experts in Dense Retrieval. In Proceedings of the 15th Italian Information Retrieval Workshop (IIR 2025) (pp. 101-107). CEUR-WS.
Investigating Mixture of Experts in Dense Retrieval
Sokli E.; Peikos G.; Pasi G.
2025
Abstract
While Dense Retrieval Models (DRMs) have advanced Information Retrieval (IR), they often suffer from limited generalizability and robustness. Various studies address these limitations with representation learning techniques that leverage the Mixture-of-Experts (MoE) architecture. Unlike prior works in IR that integrate MoE within the Transformer layers of DRMs, we add a single MoE block (SB-MoE) after the output of the final Transformer layer. Our empirical evaluation investigates how SB-MoE compares, in terms of retrieval effectiveness, to standard model fine-tuning. Given the MoE's sensitivity to its hyperparameters (i.e., the number of experts), we also investigate our model's performance under different expert configurations. Results show that SB-MoE is particularly effective for lightweight DRMs, consistently outperforming their fine-tuned counterparts. For larger DRMs, SB-MoE requires more training data to deliver improved retrieval performance. Our code is available online at: https://anonymous.4open.science/r/DenseRetrievalMoE.
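Since the abstract's central idea is architectural (a single MoE block placed after the output of the final Transformer layer of a DRM), a minimal PyTorch sketch of that idea is given below. It assumes a softmax-gated mixture of small feed-forward experts applied to the encoder's pooled embedding; the names (`SingleBlockMoE`, `num_experts`, `encoder`) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (assumed, not the paper's exact code) of a single MoE block
# applied to the pooled output of a dense retriever's final Transformer layer.
import torch
import torch.nn as nn


class SingleBlockMoE(nn.Module):
    """One MoE block: a softmax gate mixes the outputs of small expert FFNs."""

    def __init__(self, hidden_dim: int, num_experts: int = 6):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, num_experts)  # gating network
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, hidden_dim) pooled embedding from the final Transformer layer
        weights = torch.softmax(self.gate(x), dim=-1)                   # (batch, E)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, E, H)
        return (weights.unsqueeze(-1) * expert_out).sum(dim=1)          # (batch, H)


# Hypothetical usage: route the encoder's final-layer [CLS] embedding through the
# block before computing query-document similarity.
# pooled = encoder(**tokens).last_hidden_state[:, 0]
# vec = SingleBlockMoE(hidden_dim=pooled.size(-1))(pooled)
```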
| File | Attachment type | License | Size | Format |
|---|---|---|---|---|
| Sokli et al-2025-Italian Information Retrieval Workshop-CEUR-VoR.pdf (open access) | Publisher's Version (Version of Record, VoR) | Creative Commons | 402.19 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


