This study provides insights into both addressing data confidentiality concerns and enhancing document retrieval effectiveness in Data Marketplaces, which in this specific study consist of unstructured, textual documents. Through a semi-automatic sanitization process leveraging token masking with text summarization, possibly complemented by Coreference Resolution, the proposed solution mitigates the risk of inferring confidential information while maintaining search performance. Experimental results demonstrate encouraging improvements in both aspects with respect to baseline solutions.

Cassani, L., Livraga, G., Viviani, M. (2024). Assessing Document Sanitization for Controlled Information Release and Retrieval in Data Marketplaces. In Experimental IR Meets Multilinguality, Multimodality, and Interaction: 15th International Conference of the CLEF Association, CLEF 2024, Grenoble, France, September 9–12, 2024, Proceedings, Part I (pp. 88–99) [10.1007/978-3-031-71736-9_4].

Assessing Document Sanitization for Controlled Information Release and Retrieval in Data Marketplaces

Viviani, Marco
2024

Type: slide + paper
Keywords: Text Sanitization; Confidentiality; Text Summarization; Coreference Resolution; Information Retrieval; Data Marketplaces
Language: English
Conference: 15th International Conference of the CLEF Association, CLEF 2024, September 9–12, 2024
Editors: Goeuriot, L; Mulhem, P; Quénot, G; Schwab, D; Di Nunzio, GM; Soulier, L; Galuščáková, P; García Seco de Herrera, A; Faggioli, G; Ferro, N
Book title: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 15th International Conference of the CLEF Association, CLEF 2024, Grenoble, France, September 9–12, 2024, Proceedings, Part I
ISBN: 9783031717352
Date: 14 September 2024
Series and volume: LNCS 14958
Pages: 88–99
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/514099