Aiming to reduce health misinformation in Web search, we present a system for Health Information Retrieval (HIR) that ranks documents according to both their topical relevance and correctness (i.e., factuality). The system first segments documents into passages and then employs a Large Language Model (LLM) to identify and extract claims from each passage. For each claim, we formulate corresponding SPARQL queries and execute them against a Knowledge Graph (KG) extracted from a subset of DBpedia, allowing us to estimate the correctness of claims and, hence, a correctness score for documents. Topical relevance is estimated with the BM25 algorithm, which is used to produce the initial ranking of documents. To generate the final ranking, the system combines each document’s pre-computed correctness score with its topical relevance score. While existing approaches rely on machine learning or LLMs to verify correctness, our KG-based methodology enables transparent fact-checking by grounding its assessments in structured knowledge. Our approach is empirically evaluated using three TREC Health Misinformation collections (2020–2022).
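The abstract does not say how the correctness score and the BM25 score are combined, so the following is only a minimal sketch of one plausible fusion scheme: min-max normalise the BM25 scores, then linearly interpolate them with a per-document correctness score in [0, 1]. The function names, the weight `alpha`, the fallback `default_correctness`, and the toy scores are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; an empty or constant input maps to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def rerank(
    bm25: Dict[str, float],
    correctness: Dict[str, float],
    alpha: float = 0.5,                # hypothetical weight on topical relevance
    default_correctness: float = 0.5,  # hypothetical fallback when no claim was checked
) -> List[Tuple[str, float]]:
    """Combine normalised BM25 scores with pre-computed correctness scores."""
    relevance = min_max(bm25)
    combined = {
        doc: alpha * relevance[doc]
        + (1.0 - alpha) * correctness.get(doc, default_correctness)
        for doc in relevance
    }
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Toy inputs (made up): d1 is the most topically relevant but least correct.
bm25_scores = {"d1": 12.0, "d2": 10.0, "d3": 8.0}
correctness_scores = {"d1": 0.2, "d2": 1.0, "d3": 1.0}

ranking = rerank(bm25_scores, correctness_scores, alpha=0.5)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

With these toy inputs, d2 overtakes d1 in the final ranking because its higher correctness score outweighs its lower BM25 score. A different weight or a non-linear fusion could behave differently.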

Milanese, G., Peikos, G., Pasi, G., Viviani, M. (2025). Fact-Driven Health Information Retrieval: Integrating LLMs and Knowledge Graphs to Combat Misinformation. In Advances in Information Retrieval. ECIR 2025 (pp. 192-200) [10.1007/978-3-031-88714-7_17].

Fact-Driven Health Information Retrieval: Integrating LLMs and Knowledge Graphs to Combat Misinformation

Milanese, Gian Carlo; Peikos, Georgios; Pasi, Gabriella; Viviani, Marco
2025

Type: poster + paper
Keywords: Fact-checking, Knowledge graphs, Large language models, Health misinformation, Health information retrieval
Language: English
Conference: The 47th European Conference on Information Retrieval (ECIR 2025), April 6-10, 2025
Editor(s): Hauff, C.
Book title: Advances in Information Retrieval. ECIR 2025
ISBN: 9783031887130
Publication date: 4 April 2025
Series volume: LNCS 15574
Pages: 192-200
Conference URL: https://ecir2025.eu/
Access: reserved
Files in this item:
Milanese-2025-ECIR 2025-VoR.pdf
Access: archive managers only
Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 1.27 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/548684