Unsupervised neural machine translation with cross-lingual language representation agreement

H Sun, R Wang, K Chen, M Utiyama… - IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020 - ieeexplore.ieee.org
Unsupervised cross-lingual language representation initialization methods such as unsupervised bilingual word embedding (UBWE) pre-training and cross-lingual masked language model (CMLM) pre-training, together with mechanisms such as denoising and back-translation, have advanced unsupervised neural machine translation (UNMT), which has achieved impressive results on several language pairs, particularly French-English and German-English. Typically, UBWE focuses on initializing the word embedding layer in the encoder and decoder of UNMT, whereas the CMLM focuses on initializing the entire encoder and decoder of UNMT. However, UBWE/CMLM training and UNMT training are independent, which makes it difficult to assess how the quality of UBWE/CMLM affects the performance of UNMT during UNMT training. In this paper, we first empirically explore relationships between UNMT and UBWE/CMLM. The empirical results demonstrate that the performance of UBWE and CMLM has a significant influence on the performance of UNMT. Motivated by this, we propose a novel UNMT structure with cross-lingual language representation agreement to capture the interaction between UBWE/CMLM and UNMT during UNMT training. Experimental results on several language pairs demonstrate that the proposed UNMT models improve significantly over the corresponding state-of-the-art UNMT baselines.
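The abstract's central idea is adding an agreement term so that cross-lingual representation quality is optimized jointly with UNMT rather than only at initialization. As a minimal sketch of that idea (not the paper's actual formulation): combine the translation loss with a penalty that pulls aligned source/target embeddings together; the function names, the cosine form of the agreement term, and the weight `lambda_agree` are all illustrative assumptions.

```python
import numpy as np

def cosine_agreement_loss(src_emb, tgt_emb):
    """Mean (1 - cosine similarity) over a batch of aligned embedding pairs.

    Illustrative stand-in for a cross-lingual agreement objective; the
    paper's actual agreement term may be defined differently.
    """
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(src * tgt, axis=1)))

def total_loss(unmt_loss, src_emb, tgt_emb, lambda_agree=0.1):
    """Joint objective: translation loss plus a weighted agreement penalty,
    so embedding quality keeps improving during UNMT training."""
    return unmt_loss + lambda_agree * cosine_agreement_loss(src_emb, tgt_emb)

# Identical embeddings agree perfectly, so only the UNMT loss remains.
e = np.array([[1.0, 0.0], [0.0, 1.0]])
print(total_loss(2.5, e, e))  # 2.5
```

The point of the sketch is only the training-time coupling: gradients of the agreement term flow into the shared embedding layer alongside the denoising and back-translation losses, instead of the embeddings being frozen after UBWE/CMLM pre-training.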