Ceccarelli, C., Raganato, A., Viviani, M. (2025). Knowledge-Grounded Detection of Factual Hallucinations in Large Language Models. In Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025). CEUR-WS.
Knowledge-Grounded Detection of Factual Hallucinations in Large Language Models
Ceccarelli C.; Raganato A.; Viviani M.
2025
Abstract
Large Language Models (LLMs) have achieved remarkable success in various Natural Language Processing (NLP) tasks, yet they remain prone to generating factually incorrect content, known as hallucinations. In this context, this work focuses on factuality hallucinations, offering a comprehensive review of existing detection methods and an empirical evaluation of their effectiveness. In particular, we investigate the role of external knowledge integration by testing hallucination detection approaches that leverage evidence retrieved from a real-world Web search engine. Our experimental analysis compares this knowledge-enhanced strategy with alternative approaches, including uncertainty-based and black-box methods, across multiple benchmark datasets. The results indicate that, while external knowledge generally improves factuality detection, the quality and precision of the retrieval process critically affect performance. Our findings underscore the importance of grounding LLM outputs in verifiable external sources and point to future directions for improving retrieval-augmented hallucination detection systems.
| File | Type | License | Size | Format |
|---|---|---|---|---|
| Ceccarelli et al-2025-CLiC-it-CEUR-VoR.pdf | Publisher's Version (Version of Record, VoR), open access | Creative Commons | 2.26 MB | Adobe PDF |


