Searching for information online is a critical task in a setting characterized by information overload and misinformation. To address these issues, users must be provided with information that is both topically relevant and truthful. Re-ranking is a strategy often used in Information Retrieval (IR) to account for multiple dimensions of relevance. However, re-rankers often analyze the full text of documents to obtain an overall relevance score at the re-ranking stage, which can lead to sub-optimal results. Some recent Transformer-based re-rankers consider text passages rather than the entire document, but focus only on topical relevance. Transformers are also used in non-IR solutions to assess information truthfulness, but only as a binary classification task. Therefore, in this article, we propose a re-ranking-based IR model that focuses on suitably identified text passages from documents to retrieve information that is both topically relevant and truthful. This approach significantly reduces the noise introduced by query-unrelated content in long documents and allows us to evaluate the document’s truthfulness against it, enabling more effective retrieval. We tested the effectiveness of the proposed solution in the context of the Consumer Health Search task, using publicly available datasets. Our results show that the proposed approach outperforms, with statistical significance, both full-text retrieval models for multidimensional relevance, such as those based on aggregation, and Transformer-based re-rankers for monodimensional relevance, such as BERT-based re-rankers.

Upadhyay, R., Pasi, G., Viviani, M. (2023). A Passage Retrieval Transformer-Based Re-Ranking Model for Truthful Consumer Health Search. In Machine Learning and Knowledge Discovery in Databases: Research Track. European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Proceedings, Part I (pp. 355-371). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-43412-9_21].

A Passage Retrieval Transformer-Based Re-Ranking Model for Truthful Consumer Health Search

Upadhyay, R; Pasi, G; Viviani, M
2023

slide + paper
Keywords: Passage Retrieval; Health Misinformation; Transformers; Consumer Health Search
Language: English
Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2023, 18 September 2023 through 22 September 2023
Editors: Koutra, D; Plant, C; Rodriguez, MG; Baralis, E; Bonchi, F
Book title: Machine Learning and Knowledge Discovery in Databases: Research Track European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Proceedings, Part I
ISBN: 9783031434112
Volume: 14169 LNCS
Pages: 355-371
none

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/440718