
Theme

Spoken language translation technologies attempt to bridge the language barrier between people who speak different native languages but want to converse, each using their own mother tongue. Spoken language translation has to deal with the problems of both automatic speech recognition (ASR) and machine translation (MT).

One of the prominent research activities in spoken language translation is the work conducted by the Consortium for Speech Translation Advanced Research (C-STAR III), an international partnership of research laboratories engaged in the automatic translation of spoken language. Current members include ATR (Japan), CAS (China), CLIPS (France), CMU (USA), ETRI (Korea), ITC-irst (Italy), and UKA (Germany). A multilingual speech corpus of tourism-related sentences (BTEC*) has been created by the C-STAR members, and parts of this corpus were already used in previous IWSLT workshops, which focused on the evaluation of MT results based on text input (IWSLT2004) and on the translation of ASR output (word lattices, N-best lists) using read speech as input (IWSLT2005). The full BTEC* corpus consists of 160K sentence pairs of aligned text data, and parts of the corpus will be provided to all evaluation campaign participants for training purposes.

In this workshop, we focus on the translation of spontaneous speech, which includes ill-formed utterances due to grammatical incorrectness, incomplete sentences, and redundant expressions. The impact of these spontaneity aspects on ASR and MT system performance, as well as the robustness of state-of-the-art MT engines against speech recognition errors, will be investigated in detail.

Two types of submissions are invited: 1) system papers from participants in the evaluation campaign of spoken language translation technologies, and 2) technical papers on related issues. Each participant in the evaluation campaign is requested to submit a paper describing the ASR and MT systems used and to report results on the provided test data.

An overview of the evaluation campaign is as follows:

Theme:

  • Spontaneous speech translation

Translation Directions:

  • Arabic/Chinese/Italian/Japanese into English (AE, CE, IE, JE)

Input Conditions:

  • Speech (audio)
  • ASR Output (word lattice or N-best list)
  • Cleaned Transcripts (text)

Supplied Resources:

  • training corpus:
    • AE, IE:
      • 20,000 sentence pairs of BTEC*
      • three development sets (3x500 sentence pairs, 16 reference translations each)
    • CE, JE:
      • 40,000 sentence pairs of BTEC*
      • three development sets (3x500 sentence pairs, 16 reference translations each)

  • development corpus:
    • speech data, word lattices, and N-best lists for 500 input sentences, with 7 reference translations, for each translation direction and input condition

  • test corpus:
    • speech data, word lattices, and N-best lists for 500 input sentences for each translation direction and input condition
  → word segmentations will be provided according to the output of the provided ASR engines

Data Tracks:

    The results of past IWSLT workshops showed that the amount of BTEC* sentence pairs used for training largely affects the performance of the MT systems on the given task. However, only C-STAR partners have access to the full BTEC* corpus. In order to allow a fair comparison between the systems, we decided to distinguish the following two data tracks:
  • Open Data Track ("open" for everyone :->)
    • no restrictions on training data of ASR engines
    • any resources except the full BTEC* corpus and proprietary data may be used as training data for the MT engines. Concerning the BTEC* corpus and proprietary data, only the Supplied Resources (see above) may be used for training purposes.
  • C-STAR Data Track
    • no restrictions on training data of ASR engines
    • any resources (including the full BTEC* corpus and proprietary data) can be used as the training data of MT engines.

Evaluation Specification:

  • ASR output
    • (automatic) WER

  • MT output
    • (automatic) BLEU, NIST, METEOR (using 7 reference translations)
    • (subjective) fluency, adequacy
  → systems will be ranked according to the metrics underlined above
  → human assessment will be carried out for the top-10 systems (according to the BLEU metric) of the ASR Output condition, Supplied data track, Chinese-to-English translation task
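
For illustration, the following is a minimal sketch (not the official IWSLT scoring tool) of how the word error rate (WER) listed above for ASR output can be computed: the word-level edit distance between an ASR hypothesis and a reference transcript, normalized by the number of reference words. The example sentences are hypothetical.

    # Minimal WER sketch: word-level Levenshtein distance divided by reference length.
    def wer(reference: str, hypothesis: str) -> float:
        ref, hyp = reference.split(), hypothesis.split()
        # d[i][j] = edit distance between the first i reference words
        # and the first j hypothesis words
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                sub = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + sub)    # substitution
        return d[len(ref)][len(hyp)] / len(ref)

    # Hypothetical example: one substitution ("widow" for "window") and one
    # deletion ("please") against a 7-word reference gives WER = 2/7 (about 0.29).
    print(wer("i would like a window seat please",
              "i would like a widow seat"))

The automatic MT metrics (BLEU, NIST, METEOR) are computed analogously against the 7 reference translations per sentence rather than a single reference transcript.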

Technical Papers:

The workshop also invites technical papers related to spoken language translation. Possible topics include, but are not limited to:

  • Spontaneous speech translation
  • Domain and language portability
  • MT using comparable and non-parallel corpora
  • Phrase alignment algorithms
  • MT decoding algorithms
  • MT evaluation measures