Key-value attention mechanism for neural machine translation

H Mino, M Utiyama, E Sumita, T Tokunaga
Proceedings of the Eighth International Joint Conference on Natural …, 2017 - aclanthology.org
Abstract
In this paper, we propose a neural machine translation (NMT) model with a key-value attention mechanism on the source-side encoder. The key-value attention mechanism separates the source-side content vector into two types of memory, known as the key and the value. The key is used for calculating the attention distribution, and the value is used for encoding the context representation. Experiments on three different tasks indicate that our model outperforms an NMT model with a conventional attention mechanism. Furthermore, we perform experiments with a conventional NMT framework in which part of the initial value of a weight matrix is set to zero, so that the matrix has the same initial state as in the key-value attention mechanism. As a result, we obtain results comparable to those of the key-value attention mechanism without changing the network structure.
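The key/value separation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the even split of each encoder state into a key half and a value half, the dot-product scoring function, and the names `key_value_attention`, `encoder_states`, and `decoder_state` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def key_value_attention(encoder_states, decoder_state):
    """Attend over source positions using separate key and value memories.

    encoder_states: array of shape (src_len, 2 * d), one row per source token.
    decoder_state:  array of shape (d,), the current decoder query.

    Assumption: the first d dimensions of each encoder state act as the key
    and the last d dimensions act as the value.
    """
    d = encoder_states.shape[1] // 2
    keys = encoder_states[:, :d]    # used only for the attention distribution
    values = encoder_states[:, d:]  # used only for the context representation

    # Assumption: plain dot-product scoring between each key and the query.
    scores = keys @ decoder_state
    weights = softmax(scores)

    # The context vector is the attention-weighted sum of the values.
    context = weights @ values
    return weights, context


rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))  # 5 source tokens, d = 4
decoder_state = rng.normal(size=4)
weights, context = key_value_attention(encoder_states, decoder_state)
```

In this sketch, changing the value half of `encoder_states` leaves `weights` unchanged, which is the separation the abstract describes: the key alone determines where the model attends, and the value alone determines what is read out.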