Investigating Mixture of Experts in Dense Retrieval

Sokli E.; Peikos G.; Pasi G.
2025

Abstract

While Dense Retrieval Models (DRMs) have advanced Information Retrieval (IR), they often suffer from limited generalizability and robustness. Various studies address these limitations with representation learning techniques that leverage the Mixture-of-Experts (MoE) architecture. Unlike prior works in IR that integrate MoE within the Transformer layers of DRMs, we add a single MoE block (SB-MoE) after the output of the final Transformer layer. Our empirical evaluation investigates how SB-MoE compares, in terms of retrieval effectiveness, to standard model fine-tuning. Given MoE's sensitivity to its hyperparameters (i.e., the number of experts), we also investigate our model's performance under different expert configurations. Results show that SB-MoE is particularly effective for lightweight DRMs, consistently outperforming their fine-tuned counterparts. For larger DRMs, SB-MoE requires more training data to deliver improved retrieval performance. Our code is available online at: https://anonymous.4open.science/r/DenseRetrievalMoE.
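The abstract describes placing a single Mixture-of-Experts block after the final Transformer layer of a dense retriever. The following is a minimal NumPy sketch of that idea under stated assumptions: the gating scheme (dense softmax gating), the two-layer ReLU experts, the residual connection, and all names and dimensions are illustrative choices, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SingleMoEBlock:
    """Illustrative single MoE block applied to final-layer embeddings.

    Gating, expert shape, and the residual connection are assumptions
    made for this sketch, not the paper's confirmed design.
    """

    def __init__(self, dim, num_experts, hidden=64):
        self.gate_w = rng.standard_normal((dim, num_experts)) * 0.02
        # Each expert is a small two-layer feed-forward network.
        self.w1 = rng.standard_normal((num_experts, dim, hidden)) * 0.02
        self.w2 = rng.standard_normal((num_experts, hidden, dim)) * 0.02

    def __call__(self, h):
        # h: (batch, dim) embeddings from the DRM's final Transformer layer.
        gates = softmax(h @ self.gate_w)                    # (batch, num_experts)
        expert_out = np.einsum('bd,edh->beh', h, self.w1)   # per-expert hidden
        expert_out = np.maximum(expert_out, 0.0)            # ReLU
        expert_out = np.einsum('beh,ehd->bed', expert_out, self.w2)
        # Gate-weighted combination of expert outputs, plus a residual path.
        return h + np.einsum('be,bed->bd', gates, expert_out)

moe = SingleMoEBlock(dim=32, num_experts=4)
h = rng.standard_normal((8, 32))   # batch of 8 query/document embeddings
out = moe(h)
print(out.shape)  # (8, 32): same shape as the input embeddings
```

Because the block preserves the embedding dimensionality, it can be dropped between the encoder output and the scoring function without changing the rest of the retrieval pipeline; the number of experts remains the key hyperparameter the paper varies.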
Type: paper
Keywords: Dense Neural Retrievers; Mixture-of-Experts; Representation Learning
Language: English
Event: 15th Italian Information Retrieval Workshop, IIR 2025 - 3 September 2025 - 5 September 2025
Year: 2025
Published in: Proceedings of the 15th Italian Information Retrieval Workshop (IIR 2025)
Publication year: 2025
Volume: 4026
First page: 101
Last page: 107
URL: https://ceur-ws.org/Vol-4026/
Access: open
Sokli, E., Kasela, P., Peikos, G., Pasi, G. (2025). Investigating Mixture of Experts in Dense Retrieval. In Proceedings of the 15th Italian Information Retrieval Workshop (IIR 2025) (pp.101-107). CEUR-WS.
Files in this item:
File | Size | Format
Sokli et al-2025-Italian Information Retrieval Workshop-CEUR-VoR.pdf

open access

Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 402.19 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/582683
Citations
  • Scopus 0
  • Web of Science ND