An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

Raganato, Alessandro; Vázquez, R.; Creutz, M.; Tiedemann, J.
2019

Abstract

In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as a fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that performance in translation does correlate with performance on trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that the training procedure on the downstream task enables the model to identify the encoded information that is useful for the specific task, whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.
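The fixed-size representation described in the abstract can be illustrated with a minimal inner-attention (self-attentive) pooling sketch: a small set of attention heads summarizes a variable-length sequence of encoder states into a matrix whose size depends only on the number of heads, not on sentence length. This is an illustrative NumPy sketch, not the authors' implementation; all dimensions, weight matrices (`W1`, `W2`), and function names are assumptions chosen for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def inner_attention_pool(H, W1, W2):
    """Pool variable-length encoder states H (n x d) into a
    fixed-size matrix of k attended summaries (k x d)."""
    # A: k x n -- one attention distribution over tokens per head.
    A = softmax(W2 @ np.tanh(W1 @ H.T), axis=-1)
    # Weighted sums of token states: k x d, independent of n.
    return A @ H

rng = np.random.default_rng(0)
d, hidden, k = 8, 16, 4                 # illustrative sizes
W1 = rng.normal(size=(hidden, d)) * 0.1
W2 = rng.normal(size=(k, hidden)) * 0.1

short = rng.normal(size=(5, d))         # a 5-token sentence
long_ = rng.normal(size=(30, d))        # a 30-token sentence
print(inner_attention_pool(short, W1, W2).shape)  # (4, 8)
print(inner_attention_pool(long_, W1, W2).shape)  # (4, 8), same fixed size
```

Growing `k` enlarges the shared intermediate layer, which is the knob whose effect on translation quality and downstream accuracy the paper studies.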
Type: paper
Keywords: machine translation; sentence representation; inner attention
Language: English
Event: The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), 2019
Published in: Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), 2019
ISBN: 978-1-950737-35-2
Pages: 27-32
Rights: reserved
Citation: Raganato, A., Vázquez, R., Creutz, M., Tiedemann, J. (2019). An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019) (pp. 27-32). doi:10.18653/v1/W19-4304
Files in this record:
W19-4304.pdf: 369.6 kB, Adobe PDF (archive managers only; copy available on request)

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/361571
Citations
  • Scopus: 7
  • Web of Science (ISI): 4