Translation quality estimation using only bilingual corpora

L Liu, A Fujita, M Utiyama, A Finch, E Sumita

IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017 • ieeexplore.ieee.org

In computer-aided translation scenarios, quality estimation of machine translation hypotheses plays a critical role. Existing methods for word-level translation quality estimation (TQE) rely on manually annotated TQE training data obtained via direct annotation or post-editing. However, due to the cost of human labor, such data are either limited in size or available for only a few tasks in practice. To avoid the reliance on such annotated TQE data, this paper proposes an approach to train word-level TQE models using bilingual corpora, which are typically used in machine translation training and are relatively easy to access. We formalize the training of our proposed method under the framework of maximum marginal likelihood estimation. To avoid degenerate solutions, we propose a novel regularized training objective whose optimization is achieved by an efficient approximation. Extensive experiments on both written and spoken language datasets empirically show that our approach yields performance comparable to standard training on annotated data.
