
Run Submission Guidelines


BTEC Translation Task (BTEC_AE, BTEC_CE, BTEC_TE)

data format:

  • same format as the DEVELOP data sets.
For details, refer to the respective README files:
+ IWSLT/2009/corpus/BTEC/Arabic-English/README.BTEC_AE.txt
+ IWSLT/2009/corpus/BTEC/Chinese-English/README.BTEC_CE.txt
+ IWSLT/2009/corpus/BTEC/Turkish-English/README.BTEC_TE.txt
  • input text is case-sensitive and contains punctuation
  • English MT output should (a validation sketch follows the example below):
    • be in the same format as the input file (<SentenceID>\01\MT_output_text)
    • be case-sensitive, with punctuation
    • contain the same number of lines (=sentences) as the input file
Example:
     TEST_IWSLT09_001\01\This is the E translation of the 1st sentence.
     TEST_IWSLT09_002\01\This is the E translation of the 2nd sentence.
     TEST_IWSLT09_003\01\
     TEST_IWSLT09_004\01\The previous input (ID=003) could not be translated, thus the translation is empty!
     TEST_IWSLT09_005\01\...
     ...
     TEST_IWSLT09_469\01\This is the E translation of the last sentence.
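Since the evaluation relies on an exact line-by-line correspondence between input and output, it may be worth checking each run file before submission. Below is a minimal, unofficial Python sketch of such a check; the script name is ours, and we assume the separator is the literal character sequence "\01\" shown in the examples and that the files are UTF-8 encoded:

    # check_run.py -- unofficial sketch: verify that an MT output file has the
    # same number of lines and the same sentence IDs (in the same order) as the
    # input file, with the "\01\" separator present on every line.
    import sys

    SEP = "\\01\\"   # the literal "\01\" field separator from the examples

    def read_ids(path):
        """Return the sentence IDs (the text before the first "\01\")."""
        ids = []
        with open(path, encoding="utf-8") as f:
            for n, line in enumerate(f, 1):
                if SEP not in line:
                    sys.exit("%s:%d: missing \\01\\ separator" % (path, n))
                ids.append(line.split(SEP, 1)[0])
        return ids

    if __name__ == "__main__":
        in_ids = read_ids(sys.argv[1])    # test set input file
        out_ids = read_ids(sys.argv[2])   # e.g. BTEC_CE.nict.primary.txt
        if in_ids != out_ids:
            sys.exit("sentence IDs or line counts differ")
        print("OK: %d lines" % len(in_ids))

Note that an empty translation (as for ID 003 above) still carries the separator and therefore passes this check.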

run submission format:
  • each participant has to translate and submit at least one translation of the given input files for each of the translation tasks they registered for.
  • multiple run submissions are allowed, but participants have to explicitly indicate one PRIMARY run that will be used for human assessments. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) will be used for the subjective evaluation.
  • runs have to be submitted as a gzipped TAR archive (see format below; a packaging sketch follows the examples) and sent as an email attachment to "Michael Paul" (michael.paul@nict.go.jp).
TAR archive file structure:
<UserID>/<TranslationTask>.<UserID>.primary.txt
        /<TranslationTask>.<UserID>.contrastive1.txt
        /<TranslationTask>.<UserID>.contrastive2.txt
        /...
    where: <UserID> = user ID of participant used to download data files
           <TranslationTask> = BTEC_AE | BTEC_CE | BTEC_TE

Examples:
nict/BTEC_AE.nict.primary.txt
   /BTEC_CE.nict.primary.txt
   /BTEC_CE.nict.contrastive1.txt
   /BTEC_CE.nict.contrastive2.txt
   /BTEC_CE.nict.contrastive3.txt
   /BTEC_TE.nict.primary.txt
   /BTEC_TE.nict.contrastive1.txt      
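If the run files are laid out in a <UserID>/ directory as above, the archive can be created with standard tools. A minimal, unofficial Python sketch (the directory name "nict" and the archive file name are illustrative only):

    # make_submission.py -- unofficial sketch: pack a <UserID>/ directory of
    # run files into a gzipped TAR archive for submission.
    import tarfile

    user_id = "nict"   # the user ID used to download the data files
    with tarfile.open("%s_runs.tar.gz" % user_id, "w:gz") as tar:
        # adding the directory recursively keeps the required
        # <UserID>/<TranslationTask>.<UserID>.<run>.txt entry paths
        tar.add(user_id)
    print("wrote %s_runs.tar.gz" % user_id)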
  • re-submitting your runs is allowed as long as the mails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only the runs of the most recent submission mail will be used for the IWSLT 2009 evaluation; previous mails will be ignored.

CHALLENGE Translation Task (CT_CE, CT_EC)

data format:

  • same format as the DEVELOP data sets.
For details, refer to the respective README files:
+ IWSLT/2009/corpus/CHALLENGE/Chinese-English/
  README.CT_CE.txt
+ IWSLT/2009/corpus/CHALLENGE/English-Chinese/
  README.CT_EC.txt
  • the input data sets are created from the speech recognition results (ASR output) and therefore are CASE-INSENSITIVE and do NOT contain punctuation
  • the input data sets of the CHALLENGE tasks are separated according to the source language:
   + Chinese input data:
       IWSLT/2009/corpus/CHALLENGE/Chinese-English/test
   + English input data:
       IWSLT/2009/corpus/CHALLENGE/English-Chinese/test
The dialog structure is reflected in the respective sentence IDs (see the parsing sketch after the examples below).
Example:
(dialog structure)

  IWSLT09_CT.testset_dialog01_01\01\...1st English utterance...
  IWSLT09_CT.testset_dialog01_02\01\...1st Chinese utterance...
  IWSLT09_CT.testset_dialog01_03\01\...2nd English utterance...
  IWSLT09_CT.testset_dialog01_04\01\...2nd Chinese utterance...
  IWSLT09_CT.testset_dialog01_05\01\...3rd Chinese utterance...
  IWSLT09_CT.testset_dialog01_06\01\...3rd English utterance...
  ...
(English input data to be translated into Chinese)
    + IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/TXT/
      IWSLT09_CT.testset.en.txt

      IWSLT09_CT.testset_dialog01_01\01\...1st English utterance...
      IWSLT09_CT.testset_dialog01_03\01\...2nd English utterance...
      IWSLT09_CT.testset_dialog01_06\01\...3rd English utterance...
      ...
(Chinese input data to be translated into English)
    + IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/TXT/
      IWSLT09_CT.testset.zh.txt

      IWSLT09_CT.testset_dialog01_02\01\...1st Chinese utterance...
      IWSLT09_CT.testset_dialog01_04\01\...2nd Chinese utterance...
      IWSLT09_CT.testset_dialog01_05\01\...3rd Chinese utterance...
      ...
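The dialog and utterance numbers can be recovered directly from the sentence IDs. A minimal Python sketch (the ID scheme is inferred from the examples above; the function name is ours):

    # unofficial sketch: decode the dialog structure from a sentence ID such
    # as "IWSLT09_CT.testset_dialog01_02" -- the last two "_"-separated fields
    # are the dialog number and the utterance's position within the dialog.
    def parse_ct_id(sentence_id):
        """Return (dialog, utterance) numbers, e.g. (1, 2)."""
        _head, dialog, utt = sentence_id.rsplit("_", 2)
        return int(dialog.replace("dialog", "")), int(utt)

    assert parse_ct_id("IWSLT09_CT.testset_dialog01_02") == (1, 2)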
  • English MT output should:
     + be in the same format as the Chinese input file
       (<SentenceID>\01\MT_output_text)
     + be case-sensitive, with punctuation
     + contain the same number of lines (=sentences) as the Chinese input file
Example:
    + nict/CT_CE.nict.primary.txt
      IWSLT09_CT.testset_dialog01_02\01\...E translation of 1st Chinese utterance...
      IWSLT09_CT.testset_dialog01_04\01\...E translation of 2nd Chinese utterance...
      IWSLT09_CT.testset_dialog01_05\01\...E translation of 3rd Chinese utterance...
      ...
  • Chinese MT output should:
     + be in the same format as the English input file (<SentenceID>\01\MT_output_text)
     + be case-sensitive, with punctuation
     + contain the same number of lines (=sentences) as the English input file
Example:
    + nict/CT_EC.nict.primary.txt
      IWSLT09_CT.testset_dialog01_01\01\...C translation of 1st English utterance...
      IWSLT09_CT.testset_dialog01_03\01\...C translation of 2nd English utterance...
      IWSLT09_CT.testset_dialog01_06\01\...C translation of 3rd English utterance...
      ...

run submission format:
  • each participant registered for the Challenge Task has to cover both translation directions (English-Chinese AND Chinese-English) and submit a total of 4 MT output files per run:
    + translations of 2 input data conditions (CRR, ASR) for Chinese-English AND
    + translations of 2 input data conditions (CRR, ASR) for English-Chinese.

(1) the correct recognition result (CRR) data files, i.e., the human transcriptions of the Challenge Task data files that do not include recognition errors:
         CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/TXT/
             IWSLT09_CT.testset.zh.txt
         EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/TXT/
             IWSLT09_CT.testset.en.txt

(2) the speech recognition output (ASR output, with recognition errors), where participants are free to choose any of the following three ASR output data types as the input to their MT system:
        (a) word lattices:
            CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                SLF/testset/*.zh.SLF
            EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                SLF/testset/*.en.SLF

        (b) NBEST hypotheses:
            CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                NBEST/IWSLT09.testset.zh.20BEST.txt
                or
                IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                NBEST/testset/*.zh.20BEST.txt

            EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                NBEST/IWSLT09.testset.en.20BEST.txt
                or
                IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                NBEST/testset/*.en.20BEST.txt
[NOTE] larger NBEST lists can be generated from the lattice data files using the following tools:
              + IWSLT/2009/corpus/CHALLENGE/Chinese-English/tools/
                extract_NBEST.zh.CT_CE.testset.sh
              + IWSLT/2009/corpus/CHALLENGE/English-Chinese/tools/
                extract_NBEST.en.CT_EC.testset.sh

        (c) 1BEST hypotheses:
            CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                1BEST/IWSLT09.testset.zh.1BEST.txt
                or               
                IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                1BEST/testset/*.zh.1BEST.txt
            EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                1BEST/IWSLT09.testset.en.1BEST.txt
                or
                IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                1BEST/testset/*.en.1BEST.txt
[NOTE] submissions containing only the results for one translation direction will be excluded from the subjective evaluation for IWSLT 2009.
  • multiple run submissions are allowed, but participants have to explicitly indicate one PRIMARY run that will be used for human assessments. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) will be used for the subjective evaluation.
  • runs have to be submitted as a gzipped TAR archive (see format below; a file-naming sketch follows the examples) and sent as an email attachment to "Michael Paul" (michael.paul@nict.go.jp).
TAR archive file structure:
<UserID>/CT_CE.<UserID>.primary.CRR.txt
        /CT_CE.<UserID>.primary.ASR.<CONDITION>.txt
        /CT_EC.<UserID>.primary.CRR.txt
        /CT_EC.<UserID>.primary.ASR.<CONDITION>.txt
        /...
where: <UserID> = user ID of participant used to download data files
      <CONDITION> = SLF | <NUM>
      <NUM> = number of recognition hypotheses used for translation, e.g.,
                '1'  - 1-best recognition result
                '20' - 20-best hypotheses list
Examples:
    nict/CT_CE.nict.primary.CRR.txt
        /CT_CE.nict.primary.ASR.SLF.txt
        /CT_EC.nict.primary.CRR.txt
        /CT_EC.nict.primary.ASR.SLF.txt

        /CT_CE.nict.contrastive1.CRR.txt
        /CT_CE.nict.contrastive1.ASR.1.txt
        /CT_EC.nict.contrastive1.CRR.txt
        /CT_EC.nict.contrastive1.ASR.1.txt

        /CT_CE.nict.contrastive2.CRR.txt
        /CT_CE.nict.contrastive2.ASR.20.txt
        /CT_EC.nict.contrastive2.CRR.txt
        /CT_EC.nict.contrastive2.ASR.20.txt      
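For a quick completeness check, the four file names that one run must contribute can be generated mechanically from the structure above. A minimal, unofficial Python sketch (the user ID, run label, and ASR condition are illustrative):

    # unofficial sketch: list the archive members expected for one complete
    # Challenge Task run (CRR + one ASR condition, for both CT_CE and CT_EC).
    def ct_run_files(user_id, run="primary", condition="SLF"):
        return ["%s/%s.%s.%s.%s.txt" % (user_id, task, user_id, run, cond)
                for task in ("CT_CE", "CT_EC")
                for cond in ("CRR", "ASR.%s" % condition)]

    # prints, e.g.:
    #   nict/CT_CE.nict.primary.CRR.txt
    #   nict/CT_CE.nict.primary.ASR.SLF.txt
    #   nict/CT_EC.nict.primary.CRR.txt
    #   nict/CT_EC.nict.primary.ASR.SLF.txt
    print("\n".join(ct_run_files("nict")))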
  • re-submitting your runs is allowed as long as the mails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only the runs of the most recent submission mail will be used for the IWSLT 2009 evaluation; previous mails will be ignored.