During the last few years, the statistical approach has found
widespread use in machine translation of both written and spoken
language. In many comparative evaluations,
the statistical approach has been found to be competitive with or
superior to existing conventional approaches.
Like other natural language processing tasks, machine translation
requires four major components:
- an error measure for the decision rule that is used to generate
the target sentence from the source sentence (the rule itself is
sketched after this list);
- a set of probability models that replace the true
but unknown probability distributions in the decision rule;
- a training criterion that is used
to learn the unknown model parameters from training data;
- an efficient implementation of the decision rule,
which is referred to as generation or,
as in speech recognition, as search or decoding.
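For concreteness, the decision rule commonly used in statistical machine
translation is the Bayes decision rule; the following sketch uses a
conventional notation (source sentence f_1^J, target sentence e_1^I) that
need not coincide with the notation introduced later:
\[
  \hat{e}_1^{I} \;=\; \operatorname*{argmax}_{e_1^{I}} \Pr(e_1^{I} \mid f_1^{J})
                \;=\; \operatorname*{argmax}_{e_1^{I}} \bigl[ \Pr(e_1^{I}) \cdot \Pr(f_1^{J} \mid e_1^{I}) \bigr].
\]
Under a 0-1 error measure (sentence error rate), this rule minimizes the
expected number of sentence errors; the second form is the source-channel
decomposition into a target language model Pr(e_1^I) and a translation
model Pr(f_1^J | e_1^I).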
We will consider each of these four components in more detail
and review the attempts that have been made to improve the
state of the art.
In addition, we will address the problem of
recognition-translation integration, which is specific
to spoken language translation.
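To illustrate what such an integration amounts to, one common formulation
(a sketch under assumed notation, with x_1^T denoting the acoustic
observations of the spoken source sentence f_1^J) couples the recognition
and translation models in a single decision rule:
\[
  \hat{e}_1^{I} \;=\; \operatorname*{argmax}_{e_1^{I}}
    \Bigl[ \Pr(e_1^{I}) \cdot \sum_{f_1^{J}} \Pr(f_1^{J} \mid e_1^{I}) \cdot \Pr(x_1^{T} \mid f_1^{J}) \Bigr],
\]
where the sum over source-sentence hypotheses is in practice often
approximated by a maximum or restricted to a word lattice produced by the
recognizer.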