Rovida, L., Leporati, A. (2024). Transformer-based Language Models and Homomorphic Encryption: An Intersection with BERT-tiny. In IWSPA '24: Proceedings of the 10th ACM International Workshop on Security and Privacy Analytics (pp. 3-13). https://doi.org/10.1145/3643651.3659893
Transformer-based Language Models and Homomorphic Encryption: An Intersection with BERT-tiny
Rovida, L.; Leporati, A.
2024
Abstract
In recent years, emerging and improved Natural Language Processing (NLP) models, such as Bidirectional Encoder Representations from Transformers (BERT), have gained significant attention due to their performance on several natural language tasks. However, insufficient attention is usually given to the critical problems of security and data privacy, since these models require access to plain data. To address these issues, we suggest a solution based on Fully Homomorphic Encryption (FHE), which allows computations to be performed on encrypted data. In particular, we propose an FHE-based circuit that, by implementing the smallest existing BERT model, namely BERT-Tiny, enables the extraction of encrypted sentence representations and encrypted text classifications. Considering the nature and depth of this circuit, we used the Cheon-Kim-Kim-Song (CKKS) scheme, along with the bootstrapping operation. We also propose to use precomputations for the Layer Normalization, in order to lighten computations. The experiments, which can be replicated using our open-source code, are conducted on the Stanford Sentiment Treebank (SST-2) dataset. They show that the errors introduced by the precomputed Layer Normalization, the approximate FHE operations and the polynomial approximations do not produce a significant performance loss.
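As a reading aid only, the sketch below illustrates in plain NumPy the two tricks the abstract alludes to: replacing Layer Normalization's per-input statistics with values precomputed offline, so that the encrypted circuit only has to perform affine operations, and approximating a non-linear activation (GELU here) with a low-degree polynomial, since CKKS natively evaluates only additions and multiplications. This is not the authors' code (their open-source implementation is linked from the paper); the function names, calibration statistics, fitting interval and polynomial degree are made-up assumptions for illustration.

```python
# Minimal illustrative sketch, NOT the paper's implementation.
# Assumptions: the calibration statistics (mu_pre, sigma_pre), the fitting
# interval [-4, 4] and the polynomial degree are invented for this example.
import numpy as np

def precomputed_layernorm(x, mu_pre, sigma_pre, gamma, beta):
    """Affine-only LayerNorm: (x - mu_pre) / sigma_pre * gamma + beta.

    mu_pre and sigma_pre are constants estimated offline on plain calibration
    data, so under FHE this reduces to plaintext-ciphertext multiplications
    and additions -- no encrypted division or square root is needed.
    """
    return (x - mu_pre) / sigma_pre * gamma + beta

def fit_poly_gelu(degree=4, lo=-4.0, hi=4.0):
    """Least-squares polynomial fit of GELU on [lo, hi]; coefficients, highest degree first."""
    t = np.linspace(lo, hi, 2001)
    gelu = 0.5 * t * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (t + 0.044715 * t**3)))
    return np.polyfit(t, gelu, degree)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(1, 128))  # one hypothetical hidden state; BERT-Tiny's width is 128

    # Exact LayerNorm vs. the precomputed (affine-only) variant.
    gamma, beta = np.ones(128), np.zeros(128)
    exact = (x - x.mean()) / x.std() * gamma + beta
    approx = precomputed_layernorm(x, mu_pre=0.0, sigma_pre=1.0, gamma=gamma, beta=beta)
    print("LayerNorm max abs. error:", np.abs(exact - approx).max())

    # Polynomial GELU: the kind of non-linearity an FHE circuit can actually evaluate.
    coeffs = fit_poly_gelu(degree=4)
    t = np.linspace(-4.0, 4.0, 9)
    gelu = 0.5 * t * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (t + 0.044715 * t**3)))
    print("GELU max abs. error on [-4, 4]:", np.abs(np.polyval(coeffs, t) - gelu).max())
```

The design rationale behind the precomputation, as far as the abstract suggests, is that an exact Layer Normalization would require an encrypted inverse square root, which under CKKS would itself need a further polynomial approximation and additional multiplicative depth; fixing the statistics offline sidesteps that cost at the price of a small, empirically tolerable error.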
File | Description | Attachment type | License | Size | Format
---|---|---|---|---|---
Rovida-2024-IWSPA 2024-VoR.pdf (open access) | Conference paper - Proceedings | Publisher's Version (Version of Record, VoR) | Creative Commons | 1.07 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.