
Key-value attention mechanism for neural machine translation

H Mino, M Utiyama, E Sumita… - Proceedings of the Eighth …, 2017 - aclanthology.org

Proceedings of the Eighth International Joint Conference on Natural …, 2017•aclanthology.org

Abstract

In this paper, we propose a neural machine translation (NMT) model with a key-value attention mechanism on the source-side encoder. The key-value attention mechanism separates the source-side content vector into two types of memory, known as the key and the value. The key is used for calculating the attention distribution, and the value is used for encoding the context representation. Experiments on three different tasks indicate that our model outperforms an NMT model with a conventional attention mechanism. Furthermore, we perform experiments with a conventional NMT framework in which part of the initial value of a weight matrix is set to zero, so that the matrix has the same initial state as in the key-value attention mechanism. As a result, we obtain results comparable to those of the key-value attention mechanism without changing the network structure.
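The split between key and value described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the even split of each encoder state into a first-half key and a second-half value, the dot-product scoring function, and the names `key_value_attention`, `encoder_states`, and `query` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def key_value_attention(encoder_states, query):
    """Attend over encoder states split into key and value parts.

    encoder_states: array of shape (T, d), one row per source position.
    query: array of shape (d // 2,), a decoder-side query vector.

    Assumption: each state is split evenly, first half = key,
    second half = value. The abstract does not specify the split.
    """
    half = encoder_states.shape[1] // 2
    keys = encoder_states[:, :half]    # used only for attention weights
    values = encoder_states[:, half:]  # used only for the context vector

    # Assumption: dot-product scoring; the paper may use another scorer.
    scores = keys @ query
    weights = softmax(scores)

    # The context representation is a weighted sum of the value parts.
    context = weights @ values
    return weights, context


# Hypothetical toy input: 5 source positions, state size 8.
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))
query = rng.normal(size=4)

weights, context = key_value_attention(encoder_states, query)
```

In this sketch the attention distribution depends only on the key half of each state, so changing the value half alters the context vector but leaves the weights unchanged.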


