Rovida, L., Leporati, A. (2024). Transformer-based Language Models and Homomorphic Encryption: An Intersection with BERT-tiny. In IWSPA '24: Proceedings of the 10th ACM International Workshop on Security and Privacy Analytics (pp. 3-13). https://doi.org/10.1145/3643651.3659893
Transformer-based Language Models and Homomorphic Encryption: An Intersection with BERT-tiny
Rovida, L.; Leporati, A.
2024
Abstract
In recent years, emerging and improved Natural Language Processing (NLP) models, such as Bidirectional Encoder Representations from Transformers (BERT), have gained significant attention due to their performance on several natural language tasks. However, insufficient attention is usually given to the critical problems of security and data privacy, since these models require access to plain data. To address these issues, we suggest a solution based on Fully Homomorphic Encryption (FHE), which allows computations to be performed on encrypted data. In particular, we propose an FHE-based circuit that, by implementing the smallest existing BERT model, namely BERT-Tiny, enables the extraction of encrypted sentence representations and encrypted text classifications. Considering the nature and depth of this circuit, we used the Cheon-Kim-Kim-Song (CKKS) scheme, along with the bootstrapping operation. We also propose to use precomputations for the Layer Normalization, in order to lighten computations. The experiments, which can be replicated using our open-source code, are conducted on the Stanford Sentiment Treebank (SST-2) dataset. They show that the errors introduced by the precomputed Layer Normalization, the approximate FHE operations and the polynomial approximations do not produce a significant performance loss.
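As a reading aid only, the sketch below illustrates in plain NumPy the two tricks the abstract alludes to: replacing Layer Normalization's per-input statistics with values precomputed offline, so that the encrypted circuit only has to perform affine operations, and approximating a non-linear activation (GELU here) with a low-degree polynomial, since CKKS natively evaluates only additions and multiplications. This is not the authors' code (their open-source implementation is linked from the paper); the function names, calibration statistics, fitting interval and polynomial degree are made-up assumptions for illustration.

```python
# Minimal illustrative sketch, NOT the paper's implementation.
# Assumptions: the calibration statistics (mu_pre, sigma_pre), the fitting
# interval [-4, 4] and the polynomial degree are invented for this example.
import numpy as np

def precomputed_layernorm(x, mu_pre, sigma_pre, gamma, beta):
    """Affine-only LayerNorm: (x - mu_pre) / sigma_pre * gamma + beta.

    mu_pre and sigma_pre are constants estimated offline on plain calibration
    data, so under FHE this reduces to plaintext-ciphertext multiplications
    and additions -- no encrypted division or square root is needed.
    """
    return (x - mu_pre) / sigma_pre * gamma + beta

def fit_poly_gelu(degree=4, lo=-4.0, hi=4.0):
    """Least-squares polynomial fit of GELU on [lo, hi]; coefficients, highest degree first."""
    t = np.linspace(lo, hi, 2001)
    gelu = 0.5 * t * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (t + 0.044715 * t**3)))
    return np.polyfit(t, gelu, degree)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(1, 128))  # one hypothetical hidden state; BERT-Tiny's width is 128

    # Exact LayerNorm vs. the precomputed (affine-only) variant.
    gamma, beta = np.ones(128), np.zeros(128)
    exact = (x - x.mean()) / x.std() * gamma + beta
    approx = precomputed_layernorm(x, mu_pre=0.0, sigma_pre=1.0, gamma=gamma, beta=beta)
    print("LayerNorm max abs. error:", np.abs(exact - approx).max())

    # Polynomial GELU: the kind of non-linearity an FHE circuit can actually evaluate.
    coeffs = fit_poly_gelu(degree=4)
    t = np.linspace(-4.0, 4.0, 9)
    gelu = 0.5 * t * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (t + 0.044715 * t**3)))
    print("GELU max abs. error on [-4, 4]:", np.abs(np.polyval(coeffs, t) - gelu).max())
```

The design rationale behind the precomputation, as far as the abstract suggests, is that an exact Layer Normalization would require an encrypted inverse square root, which under CKKS would itself need a further polynomial approximation and additional multiplicative depth; fixing the statistics offline sidesteps that cost at the price of a small, empirically tolerable error.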
File | Description | Attachment type | License | Size | Format
---|---|---|---|---|---
Rovida-2024-IWSPA 2024-VoR.pdf (open access) | Conference paper - Proceedings | Publisher's Version (Version of Record, VoR) | Creative Commons | 1.07 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.