

IWSLT 2008 Corpus
(ONLY available for participants of the IWSLT 2008 evaluation campaign)

The corpus contains the training and test data sets, the reference translations, the MT outputs, and the automatic/subjective evaluation results.

End-User License Agreement

Evaluation Corpus
CHALLENGE Chinese-English (147MB)
CHALLENGE English-Chinese (18MB)
BTEC Arabic-English (70MB)
BTEC Chinese-English (200MB)
BTEC Chinese-Spanish (174MB)
PIVOT Chinese-English-Spanish (175MB)

To get access to the corpus, please follow the procedure below. Access will be enabled only AFTER we have received your original signed license agreement.
  1. download the user license agreement, sign it, and send two copies back to:
    Michael Paul
    National Institute of Information and Communications Technology
    Knowledge Creating Communication Research Center
    MASTAR Project
    Language Translation Group
    2-2-2 Hikaridai, "Keihanna Science City"
    Kyoto 619-0288, Japan

  2. download the corpus files using the ID and Password you obtained for the download of the training data files for IWSLT 2008.

Other Resources

- links to external resources provided by IWSLT participants
- restoring punctuation and case information in MT output

Templates for LaTeX/MSWord

- LaTeX style: iwslt08.sty
- Example document: template.tex
- Example document PS:
- Example document PDF: template.pdf
- Bibliography style: IEEEtran.bst
- MS-Word template: template.doc