December 2009 Archive

Theme

The International Workshop on Spoken Language Translation (IWSLT) is a yearly, open evaluation campaign for spoken language translation followed by a scientific workshop, in which both system descriptions and scientific papers are presented. IWSLT's evaluations are not competition-oriented; rather, their goal is to foster cooperative work and scientific exchange. In this respect, IWSLT proposes challenging research tasks and an open experimental infrastructure for the scientific community working on spoken and written language translation.

Evaluation Campaign

The 6th International Workshop on Spoken Language Translation will take place in Tokyo, Japan in December 2009. The focus of this year's evaluation campaign is the translation of task-oriented human dialogs in travel situations. The speech data was recorded in interpreter-mediated sessions in which native speakers of different languages were asked to complete travel-related tasks, such as hotel reservations, using their mother tongue; the translation of the freely uttered conversation was carried out by human interpreters. The obtained speech data was annotated with dialog and speaker information. For the Challenge Task, IWSLT participants have to translate both the Chinese and the English outputs of the automatic speech recognizers (lattice, N/1BEST) into English and Chinese, respectively.

As in previous IWSLT events, a standard BTEC Task will be provided. However, the BTEC Task focuses on text input only, i.e., no automatic speech recognition results (lattice, N/1BEST) have to be translated. In addition to the Arabic-English and Chinese-English translation tasks, this year's evaluation campaign features Turkish as a new input language.

Each participant in the evaluation campaign is requested to submit a paper describing the MT system, the utilized resources, and results on the provided test data. Contrastive run submissions using only the bilingual resources provided by IWSLT, as well as investigations into the contribution of each utilized resource, are highly appreciated. Moreover, all participants are requested to present their papers at the workshop.

Scientific Paper

In addition to the evaluation campaign, the IWSLT 2009 workshop also invites scientific paper submissions related to spoken language technologies. Possible topics include, but are not limited to:

  • Spoken dialog modeling
  • Integration of ASR and MT
  • SMT, EBMT, RBMT, Hybrid MT
  • MT evaluation
  • Language resources for MT
  • Open source software for MT
  • Pivot-language-based MT
  • Task adaptation and portability in MT

Evaluation Campaign

The evaluation campaign is carried out using BTEC (Basic Travel Expression Corpus), a multilingual speech corpus containing tourism-related sentences similar to those that are usually found in phrasebooks for tourists going abroad. In addition, parts of the SLDB (Spoken Language Databases) corpus, a collection of human-mediated cross-lingual dialogs in travel situations, are provided to the participants of the Challenge Task. Details about the supplied corpora, the data set conditions for each track, the guidelines on how to submit one's translation results, and the evaluation specifications used in this workshop are given below.

Please note that, compared to previous IWSLT evaluation campaigns, the guidelines on how to use the language resources for each data track have changed for IWSLT 2009. Starting in 2007, we encouraged everyone to collect out-of-domain language resources and tools that could be shared between the participants. This was very helpful for many participants and allowed many interesting experiments, but it had the side effect that system outputs became difficult to compare: it was impossible to tell whether gains in performance came from better-suited (or simply larger) language resources (an engineering aspect) or from improvements in the underlying decoding algorithms and statistical models (a research aspect). After the IWSLT 2008 workshop, many participants asked us to focus on the research aspects for IWSLT 2009.

Therefore, the monolingual and bilingual language resources that may be used to train the translation engines for the primary runs are limited to the supplied corpus for each translation task. This includes all supplied development sets, i.e., you are free to use these data sets as you wish, for example for tuning of model parameters or as additional training bitext. All other language resources besides the ones for the given translation task should be treated as "additional language resources"; examples include additional dictionaries, word lists, and bitext corpora such as those provided by the LDC. In addition, some participants asked whether they could use the supplied BTEC TE and BTEC AE resources for the BTEC CE task; these should also be treated as "additional resources". Because it is impossible to limit the usage of linguistic tools like word segmentation tools, parsers, etc., such tools are allowed for preprocessing the supplied corpus, but we kindly ask participants to describe in detail in their system description paper which tools were applied for data preprocessing.

In order to motivate participants to continue to explore the effects of additional language resources (model adaptation, OOV handling, etc.), we DO ACCEPT contrastive runs based on additional resources. These will be evaluated automatically using the same framework as the primary runs, so the results will be directly comparable to this year's primary runs. Due to workshop budget limits, however, it would be difficult to include all contrastive runs in the subjective evaluation. Therefore, we kindly ask participants for a contribution if they would like to obtain a human assessment of their contrastive runs as well. If you intend to do so, please contact us as soon as possible, so that we can adjust the evaluation schedule accordingly. Contrastive run results will not appear in the overview paper, but participants are free to report their findings in the MT system description paper or even in a separate scientific paper submission.

[Corpus Specifications]

[Translation Input Conditions]

[Evaluation Specifications]



Corpus Specifications

BTEC Training Corpus:
  • data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <SENTENCE_ID>\01\<MT_TRAINING_SENTENCE>
    • Field_1: sentence ID
    • Field_2: paraphrase ID
    • Field_3: MT training sentence
  • example:
    • TRAIN_00001\01\This is the first training sentence.
    • TRAIN_00002\01\This is the second training sentence.
  • Arabic-English (AE)
  • Chinese-English (CE)
  • Turkish-English (TE)

    • 20K sentences randomly selected from the BTEC corpus
    • coding: UTF-8
    • text is case-sensitive and includes punctuation
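The '\'-delimited format above is simple to process. As an illustration only (the function name is hypothetical, not part of the supplied tools), a minimal Python sketch:

```python
# Hypothetical helper (not part of the supplied tools) for reading the
# '\'-delimited corpus files described above.
def parse_corpus_line(line):
    """Split one corpus line into (sentence_id, paraphrase_id, text)."""
    # maxsplit=2 keeps any further backslashes inside the sentence text intact
    sentence_id, paraphrase_id, text = line.rstrip("\n").split("\\", 2)
    return sentence_id, paraphrase_id, text
```

For example, `parse_corpus_line("TRAIN_00001\\01\\This is the first training sentence.")` yields the three fields listed above.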

BTEC Develop Corpus:

  • text input, reference translations of BTEC sentences
  • data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <SENTENCE_ID>\<PARAPHRASE_ID>\<TEXT>
    • Field_1: sentence ID
    • Field_2: paraphrase ID
    • Field_3: MT develop sentence / reference translation
  • text input example:
    • DEV_001\01\This is the first develop sentence.
    • DEV_002\01\This is the second develop sentence.
  • reference translation example:
  • DEV_001\01\1st reference translation for 1st input
    DEV_001\02\2nd reference translation for 1st input
    ...
    DEV_002\01\1st reference translation for 2nd input
    DEV_002\02\2nd reference translation for 2nd input
    ...
  • Arabic-English
    • CSTAR03 testset: 506 sentences, 16 reference translations
    • IWSLT04 testset: 500 sentences, 16 reference translations
    • IWSLT05 testset: 506 sentences, 16 reference translations
    • IWSLT07 testset: 489 sentences, 6 reference translations
    • IWSLT08 testset: 507 sentences, 16 reference translations

  • Chinese-English
    • CSTAR03 testset: 506 sentences, 16 reference translations
    • IWSLT04 testset: 500 sentences, 16 reference translations
    • IWSLT05 testset: 506 sentences, 16 reference translations
    • IWSLT07 testset: 489 sentences, 6 reference translations
    • IWSLT08 testset: 507 sentences, 16 reference translations

  • Turkish-English
    • CSTAR03 testset: 506 sentences, 16 reference translations
    • IWSLT04 testset: 500 sentences, 16 reference translations

BTEC Test Corpus:

  • Arabic-English
  • Chinese-English
  • Turkish-English
    • 470 unseen sentences of the BTEC evaluation corpus
    • coding: → see BTEC Develop Corpus
    • data format: → see BTEC Develop Corpus


CHALLENGE Training Corpus:
  • TXT data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <DIALOG_ID>\<SENTENCE_ID>\<MT_TRAINING_SENTENCE>
    • Field_1: dialog ID
    • Field_2: sentence ID
    • Field_3: MT training sentence
    • example:
    • train_dialog01\01\This is the first training sentence.
    • train_dialog01\02\This is the second training sentence.
    • ...
  • INFO data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <DIALOG_ID>\<SENTENCE_ID>\<SPEAKER_TAG>
    • Field_1: dialog ID
    • Field_2: sentence ID
    • Field_3: speaker annotations ('a': agent, 'c': customer, 'i': interpreter)
    • example:
    • train_dialog01\01\a
    • train_dialog01\02\i
    • train_dialog01\03\a
    • ...
    • train_dialog398\20\i
    • train_dialog398\21\i
    • train_dialog398\22\c
  • Chinese-English (CE)
  • English-Chinese (EC)

    • 394 dialogs, 10K sentences from the SLDB corpus
    • coding: UTF-8
    • word segmentations according to ASR output segmentation
    • text is case-sensitive and includes punctuation
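Since the TXT and INFO files are keyed by the same dialog and sentence IDs, the speaker annotations can be joined onto the sentences. A hypothetical Python sketch (function and variable names are not from the supplied tools):

```python
# Hypothetical sketch: pair each CHALLENGE training sentence with its speaker
# tag, assuming the TXT and INFO files share the same
# <DIALOG_ID>\<SENTENCE_ID> keys.
def join_txt_info(txt_lines, info_lines):
    tags = {}
    for line in info_lines:
        dialog_id, sent_id, tag = line.rstrip("\n").split("\\", 2)
        tags[(dialog_id, sent_id)] = tag
    paired = []
    for line in txt_lines:
        dialog_id, sent_id, text = line.rstrip("\n").split("\\", 2)
        paired.append((dialog_id, sent_id, tags.get((dialog_id, sent_id)), text))
    return paired
```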

CHALLENGE Develop Corpus:

  • ASR output (lattice, NBEST, 1BEST), correct recognition result transcripts (text), reference translations of SLDB dialogs
  • data format:
    • 1-BEST
      • each line consists of three fields divided by the character '\'
      • sentence consisting of words divided by single spaces
      • format: <SENTENCE_ID>\01\<RECOGNITION_HYPOTHESIS>
      • Field_1: sentence ID
      • Field_2: paraphrase ID
      • Field_3: best recognition hypothesis
      • example (input):
      • IWSLT09_CT.devset_dialog01_02\01\best ASR hypothesis for 1st utterance
        IWSLT09_CT.devset_dialog01_04\01\best ASR hypothesis for 2nd utterance
        IWSLT09_CT.devset_dialog01_06\01\best ASR hypothesis for 3rd utterance
        ...
    • N-BEST
      • each line consists of three fields divided by the character '\'
      • sentence consisting of words divided by single spaces
      • format: <SENTENCE_ID>\<NBEST_ID>\<RECOGNITION_HYPOTHESIS>
      • Field_1: sentence ID
      • Field_2: NBEST ID (max: 20)
      • Field_3: recognition hypothesis
      • example (input):
      • IWSLT09_CT.devset_dialog01_02\01\best ASR hypothesis for 1st utterance
        IWSLT09_CT.devset_dialog01_02\02\2nd-best ASR hypothesis for 1st utterance
        ...
        IWSLT09_CT.devset_dialog01_02\20\20th-best ASR hypothesis for 1st utterance
        IWSLT09_CT.devset_dialog01_04\01\best ASR hypothesis for 2nd utterance
        ...
    • reference translations
      • each line consists of three fields divided by the character '\'
      • sentence consisting of words divided by single spaces
      • format: <SENTENCE_ID>\<PARAPHRASE_ID>\<REFERENCE>
      • Field_1: sentence ID
      • Field_2: paraphrase ID
      • Field_3: reference translation
      • example:
      • IWSLT09_CT.devset_dialog01_02\01\1st reference translation for 1st input
        IWSLT09_CT.devset_dialog01_02\02\2nd reference translation for 1st input
        ...
        IWSLT09_CT.devset_dialog01_04\01\1st reference translation for 2nd input
        IWSLT09_CT.devset_dialog01_04\02\2nd reference translation for 2nd input
        ...
  • Chinese-English
    • IWSLT05 testset: 506 sentences, 16 reference translations (read speech)
    • IWSLT06 devset: 489 sentences, 16 reference translations (read speech, spontaneous speech)
    • IWSLT06 testset: 500 sentences, 16 reference translations (read speech, spontaneous speech)
    • IWSLT08 devset: 245 sentences, 7 reference translations (spontaneous speech)
    • IWSLT08 testset: 506 sentences, 7 reference translations (spontaneous speech)
    • IWSLT09 devset: 10 dialogs, 200 sentences, 4 reference translations (spontaneous speech)
  • English-Chinese
    • IWSLT05 testset: 506 sentences, 16 reference translations (read speech)
    • IWSLT08 devset: 245 sentences, 7 reference translations (spontaneous speech)
    • IWSLT08 testset: 506 sentences, 7 reference translations (spontaneous speech)
    • IWSLT09 devset: 10 dialogs, 210 sentences, 4 reference translations (spontaneous speech)
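Since the N-BEST files store all hypotheses for an utterance under the same sentence ID, the 1-best subset can be recovered by filtering on the NBEST ID field. A hypothetical sketch (not one of the supplied tools):

```python
# Hypothetical sketch: reduce an N-BEST file to its 1-best hypotheses by
# keeping only the lines whose NBEST ID field is '01'.
def extract_1best(nbest_lines):
    best = []
    for line in nbest_lines:
        sent_id, nbest_id, hyp = line.rstrip("\n").split("\\", 2)
        if nbest_id == "01":
            best.append((sent_id, hyp))
    return best
```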

CHALLENGE Test Corpus:

  • Chinese-English
    • 27 dialogs, 405 sentences
    • coding: → see CHALLENGE Develop Corpus
    • TXT data format: → see CHALLENGE Develop Corpus
    • INFO data format: → see CHALLENGE Training Corpus
  • English-Chinese
    • 27 dialogs, 393 sentences
    • coding: → see CHALLENGE Develop Corpus
    • TXT data format: → see CHALLENGE Training Corpus
    • INFO data format: → see CHALLENGE Training Corpus


Translation Input Conditions

Spontaneous Speech

  • Challenge Task
    • Chinese-English
    • English-Chinese
→ ASR output (word lattice, N-best, 1-best) of ASR engines provided by IWSLT organizers

Correct Recognition Results

  • Challenge Task
    • Chinese-English
    • English-Chinese
  • BTEC Task
    • Arabic-English
    • Chinese-English
    • Turkish-English
→ text input

Evaluation

Subjective Evaluation:

  • Metrics:
    • ranking
      (= official evaluation metrics to order MT system scores)
      → all primary run submissions
    • fluency/adequacy
      → top-ranked primary run submission
    • dialog adequacy
      (= adequacy judgments in the context of the given dialog)
      → top-ranked primary run submission
  • Evaluators:
    • 3 graders per translation

Automatic Evaluation:

  • Metrics:
    • BLEU/NIST (NIST v13)
    • → bug fixes to handle empty translations and the IWSLT-supplied corpus can be found here.
    → up to 7 reference translations
    → all run submissions
  • Evaluation Specifications:
    • case+punc:
      • case sensitive
      • with punctuation marks tokenized
    • no_case+no_punc:
      • case insensitive (lower-case only)
      • no punctuation marks
  • Data Processing Prior to Evaluation:
    • English MT Output:
      • simple tokenization of punctuations (see 'tools/ppEnglish.case+punc.pl' script)
    • Chinese MT Output:
      • segmentation into characters (see 'tools/splitUTF8Characters' script)
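The two evaluation specifications and the Chinese character segmentation step can be approximated as follows. This is only a rough illustration (ASCII punctuation only); the official `tools/` scripts remain authoritative:

```python
import re
import string

# Rough approximation (ASCII punctuation only) of the no_case+no_punc
# evaluation specification; the official tools/ scripts are authoritative.
def normalize_no_case_no_punc(text):
    """Lowercase and strip punctuation marks."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def split_characters(text):
    """Segment a string into single characters separated by spaces,
    mimicking the character-level scoring of Chinese MT output."""
    return " ".join(ch for ch in text if not ch.isspace())
```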

Important Dates


Evaluation Campaign

Event Date
Training Corpus Release June 19, 2009
Test Corpus Release Aug 14, 2009
Run Submission Due Aug 28, 2009
Result Feedback to Participants September 11, 2009
MT System Descriptions Due September 18, 2009
Notification of Acceptance October 16, 2009
Camera-ready Paper Due October 31, 2009
Workshop December 1 - 2, 2009

Technical Papers

Event Date
Paper Submission Due August 21, 2009
Notification of Acceptance October 9, 2009
Camera-ready Paper Due October 31, 2009
Workshop December 1 - 2, 2009

Downloads


IWSLT 2009 Corpus Release (for IWSLT 2009 participants only)

User License Agreement DOC, PDF
Download
  CHALLENGE Task
  • Chinese ↔ English
  BTEC Task
  • Arabic → English
  • Chinese → English
  • Turkish → English

In order to get access to the corpus, please follow the procedure below. Access will be enabled AFTER we have received your original signed user license agreement.

  1. download the post-workshop user license agreement (click on the DOC/PDF link above), sign it, and send two copies to:

    Michael Paul
    National Institute of Information and Communications Technology
    Knowledge Creating Communication Research Center
    MASTAR Project
    Language Translation Group
    3-5 Hikaridai, "Keihanna Science City"
    Kyoto 619-0289, Japan

  2. download the corpus files using the ID and Password you obtained for the download of the training data files for IWSLT 2009.


Corpus Data Files

- train:
  • (BTEC) 20K sentence pairs of translation examples with case and punctuation information, segmented according to the utilized ASR engine
  • (CHALLENGE) in addition to BTEC@train, 10K sentence pairs of translation examples with dialog annotations
- dev:
  • up to 6 evaluation data sets containing 500 source language sentences with multiple references and ASR output data files (= testsets of previous IWSLT evaluation campaigns)
- test:
  • 500 source language sentences and ASR output data files (= input of run submissions of this year's evaluation campaign)
- tools:
  • preprocessing scripts (tokenization, NBEST extraction, etc.) used to prepare the data sets

For data set details, click on the translation direction name tag.


CHALLENGE
Chinese-English
train dev test tools
TGZ TGZ TGZ TGZ
English-Chinese
train dev test tools
TGZ TGZ TGZ TGZ

BTEC
Arabic-to-English
train dev test tools
TGZ TGZ TGZ TGZ
Chinese-to-English
train dev test tools
TGZ TGZ TGZ TGZ
Turkish-to-English
train dev test tools
TGZ TGZ TGZ TGZ



Templates for LaTeX/MSWord

- Gzipped TAR archive (all template files): latex_template_iwslt09.tgz

- LaTeX style: iwslt09.sty
- Example document: template.tex
- Example document PS: template.ps
- Example document PDF: template.pdf
- Bibliography style: IEEEtran.bst
- MS-Word template: template.doc

Submission


Technical Papers and MT System Descriptions must be submitted electronically in PDF format using the above links. The style files and templates are available on the download page. Authors are strongly encouraged to use the provided LaTeX style files or their MS-Word equivalents. Submissions should follow the "Paper Submission Format Guidelines" listed below.

Paper Submission Format Guidelines
The format of each paper submission (evaluation campaign and technical paper) should agree with the "Camera-Ready Paper Format Guidelines" listed below.

Camera-Ready Paper Format Guidelines
  • PDF file format
  • Maximum eight (8) pages (Standard A4 size: 210 mm by 297 mm preferred)
  • Single-spaced
  • Two (2) columns
  • Printable in black ink on white paper; check that the positioning (left and top margins) as well as other layout features are correct.
  • No smaller than nine (9) point type font throughout the paper, including figure captions.
  • To achieve the best viewing experience for the Proceedings, we strongly encourage authors to use the Times-Roman font (the LaTeX style file as well as the Word template files use Times-Roman). This is needed in order to give the Proceedings a uniform look.
  • Do NOT include headers and footers. Page numbers and conference identification will be added automatically when the Proceedings are printed.
  • The first page should have the paper title, author(s), and affiliation(s) centered on the page across both columns. The remainder of the text must be in the two-column format, staying within the indicated image area.
  • Follow the style of the sample paper that is included with regard to title, authors, affiliations, abstract, heading, and subheadings.
Paper Title The paper title must be in boldface. All non-function words must be capitalized, and all other words in the title must be lower case. The paper title is centered across the top of the two columns on the first page, as indicated above.
Authors' Name(s) The authors' name(s) and affiliation(s) appear centered below the paper title. If space permits, include a mailing address here. The templates indicate the area where the title and author information should go. These items need not be strictly confined to the number of lines indicated; papers with multiple authors and affiliations, for example, may require two or more lines for this information.
Abstract Each paper must contain an abstract that appears at the beginning of the paper.
Major Headings Major headings are in boldface, with the first word capitalized and the rest of the heading in lower case. Examples of the various levels of headings are included in the templates.
Sub Headings Sub headings appear like major headings, except they start at the left margin in the column.
Sub-Sub Headings Sub-sub headings appear like sub headings, except they are in italics and not bold face.
References Number and list all references at the end of the paper. The references are numbered in order of appearance in the document. When referring to them in the text, type the corresponding reference number in square brackets as shown at the end of this sentence [1]. (This is done automatically when using the LaTeX template.)
Illustrations Illustrations must appear within the designated margins, and must be positioned within the paper margins. They may span the two columns. If possible, position illustrations at the top of columns, rather than in the middle or at the bottom. Caption and number every illustration. All half-tone or color illustrations must be clear when printed in black and white.


Templates
If your paper will be typeset using LaTeX, please download the template package here; it will generate the proper format. To extract the files under UNIX, run: $ tar -xzf latex_template_iwslt09.tgz
Paper Status
After submission, each paper will be given a unique Paper ID and a password. This will be shown on the confirmation page right after submission of the documents and a confirmation email including the Paper ID will be sent to the author of the paper as well. It will be possible to check and correct (if necessary) the submitted paper information (names, affiliations, etc.). Corrections/uploads can be made up to the respective submission deadline.

Paper Acceptance/Rejection Information
Each corresponding author will be notified by e-mail of acceptance/rejection. Reviewer feedback will also be available for each paper.

Run Submission Guidelines


BTEC Translation Task (BTEC_AE, BTEC_CE, BTEC_TE)

data format:

  • same format as the DEVELOP data sets.
For details, refer to the respective README files:
+ IWSLT/2009/corpus/BTEC/Arabic-English/README.BTEC_AE.txt
+ IWSLT/2009/corpus/BTEC/Chinese-English/README.BTEC_CE.txt
+ IWSLT/2009/corpus/BTEC/Turkish-English/README.BTEC_TE.txt
  • input text is case-sensitive and contains punctuation
  • English MT output should:
    • be in the same format as the input file (<SentenceID>\01\MT_output_text)
    • be case-sensitive, with punctuation
    • contain the same number of lines (= sentences) as the input file
Example:
     TEST_IWSLT09_001\01\This is the E translation of the 1st sentence.
     TEST_IWSLT09_002\01\This is the E translation of the 2nd sentence.
     TEST_IWSLT09_003\01\
     TEST_IWSLT09_004\01\The previous input (ID=003) could not be translated, thus the translation is empty!
     TEST_IWSLT09_005\01\...
     ...
     TEST_IWSLT09_469\01\This is the E translation of the last sentence.
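Before mailing a run, it may help to sanity-check that the output file is line-aligned with the input file. A hypothetical pre-submission check (not an official tool):

```python
# Hypothetical pre-submission check (not an official tool): verify that an
# MT output file matches its input file line-for-line by sentence ID.
def validate_run(input_lines, output_lines):
    if len(input_lines) != len(output_lines):
        return False
    for src, hyp in zip(input_lines, output_lines):
        # compare only the <SentenceID> field before the first backslash
        if src.split("\\", 1)[0] != hyp.split("\\", 1)[0]:
            return False
    return True
```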

run submission format:
  • each participant has to translate and submit at least one translation of the given input files for each of the translation tasks they registered for.
  • multiple run submissions are allowed, but participants have to explicitly indicate one PRIMARY run that will be used for human assessments. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) will be used for the subjective evaluation.
  • runs have to be submitted as a gzipped TAR archive (format see below) and sent as an email attachment to "Michael Paul" (michael.paul@nict.go.jp).
TAR archive file structure:
<UserID>/<TranslationTask>.<UserID>.primary.txt
        /<TranslationTask>.<UserID>.contrastive1.txt
        /<TranslationTask>.<UserID>.contrastive2.txt
        /...
    where: <UserID> = user ID of participant used to download data files
           <TranslationTask> = BTEC_AE | BTEC_CE | BTEC_TE

Examples:
nict/BTEC_AE.nict.primary.txt
   /BTEC_CE.nict.primary.txt
   /BTEC_CE.nict.contrastive1.txt
   /BTEC_CE.nict.contrastive2.txt
   /BTEC_CE.nict.contrastive3.txt
   /BTEC_TE.nict.primary.txt
   /BTEC_TE.nict.contrastive1.txt      
  • re-submitting your runs is allowed as long as the emails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only the runs of the most recent submission mail will be used for the IWSLT 2009 evaluation; previous mails will be ignored.
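The required archive layout can be produced with standard tools. As a hypothetical sketch (function name and paths are illustrative, not prescribed), one way to package run files under the `<UserID>/` prefix:

```python
import os
import tarfile

# Hypothetical sketch: package run files under the required <UserID>/ prefix
# inside a gzipped TAR archive.
def build_submission(user_id, run_files, archive_path):
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in run_files:
            # store each file as <UserID>/<filename> inside the archive
            tar.add(path, arcname=f"{user_id}/{os.path.basename(path)}")
```

For example, `build_submission("nict", ["BTEC_CE.nict.primary.txt"], "runs.tgz")` would create an archive matching the structure shown above.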

CHALLENGE Translation Task (CT_CE, CT_EC)

data format:

  • same format as the DEVELOP data sets.
For details, refer to the respective README files:
+ IWSLT/2009/corpus/CHALLENGE/Chinese-English/
  README.CT_CE.txt
+ IWSLT/2009/corpus/CHALLENGE/English-Chinese/
  README.CT_EC.txt
  • the input data sets are created from the speech recognition results (ASR output) and therefore are CASE-INSENSITIVE and do NOT contain punctuation
  • the input data sets of the CHALLENGE tasks are separated according to the source language:
   + Chinese input data:
       IWSLT/2009/corpus/CHALLENGE/Chinese-English/test
   + English input data:
       IWSLT/2009/corpus/CHALLENGE/English-Chinese/test
The dialog structure is reflected in the respective sentence ID.
Example:
(dialog structure)

  IWSLT09_CT.testset_dialog01_01\01\...1st English utterance...
  IWSLT09_CT.testset_dialog01_02\01\...1st Chinese utterance...
  IWSLT09_CT.testset_dialog01_03\01\...2nd English utterance...
  IWSLT09_CT.testset_dialog01_04\01\...2nd Chinese utterance...
  IWSLT09_CT.testset_dialog01_05\01\...3rd Chinese utterance...
  IWSLT09_CT.testset_dialog01_06\01\...3rd English utterance...
  ...
(English input data to be translated into Chinese)
    + IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/TXT/
      IWSLT09_CT.testset.en.txt

      IWSLT09_CT.testset_dialog01_01\01\...1st English utterance...
      IWSLT09_CT.testset_dialog01_03\01\...2nd English utterance...
      IWSLT09_CT.testset_dialog01_06\01\...3rd English utterance...
      ...
(Chinese input data to be translated into English)
    + IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/TXT/
      IWSLT09_CT.testset.zh.txt

      IWSLT09_CT.testset_dialog01_02\01\...1st Chinese utterance...
      IWSLT09_CT.testset_dialog01_04\01\...2nd Chinese utterance...
      IWSLT09_CT.testset_dialog01_05\01\...3rd Chinese utterance...
      ...
  • English MT output should:
     + be in the same format as the Chinese input file
       (<SentenceID>\01\MT_output_text)
     + be case-sensitive, with punctuation
     + contain the same number of lines (= sentences) as the Chinese input file
Example:
    + nict/CT_CE.nict.primary.txt
      IWSLT09_CT.testset_dialog01_02\01\...E translation of 1st Chinese utterance...
      IWSLT09_CT.testset_dialog01_04\01\...E translation of 2nd Chinese utterance...
      IWSLT09_CT.testset_dialog01_05\01\...E translation of 3rd Chinese utterance...
      ...
  • Chinese MT output should:
     + be in the same format as the English input file (<SentenceID>\01\MT_output_text)
     + be case-sensitive, with punctuation
     + contain the same number of lines (= sentences) as the English input file
Example:
    + nict/CT_EC.nict.primary.txt
      IWSLT09_CT.testset_dialog01_01\01\...C translation of 1st English utterance...
      IWSLT09_CT.testset_dialog01_03\01\...C translation of 2nd English utterance...
      IWSLT09_CT.testset_dialog01_06\01\...C translation of 3rd English utterance...
      ...
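Since the dialog structure is encoded in the sentence IDs, translated CE and EC outputs can be interleaved back into dialog order for inspection. A hypothetical sketch, assuming the IDs (e.g. IWSLT09_CT.testset_dialog01_02) sort lexicographically within a dialog:

```python
# Hypothetical sketch: interleave translated CE and EC output lines back
# into dialog order, assuming the sentence IDs sort lexicographically
# within a dialog (two-digit dialog and utterance numbers).
def merge_dialog(ce_lines, ec_lines):
    return sorted(ce_lines + ec_lines, key=lambda line: line.split("\\", 1)[0])
```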

run submission format:
  • each participant registered for the Challenge Task has to translate both translation directions (English-Chinese AND Chinese-English) and submit a total of 4 MT output files per run:
    + translations of 2 input data conditions (CRR, ASR) for Chinese-English AND
    + translations of 2 input data conditions (CRR, ASR) for English-Chinese.

(1) the correct recognition result (CRR) data files, i.e., the human transcriptions of the Challenge Task data files that do not include recognition errors:
         CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/TXT/
             IWSLT09_CT.testset.zh.txt
         EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/TXT/
             IWSLT09_CT.testset.en.txt

(2) the speech recognition output (ASR output, with recognition errors), whereby the participants are free to choose any of the following three ASR output data types as the input of their MT system:
        (a) word lattices:
            CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                SLF/testset/*.zh.SLF
            EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                SLF/testset/*.en.SLF

        (b) NBEST hypotheses:
            CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                NBEST/IWSLT09.testset.zh.20BEST.txt
                or
                IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                NBEST/testset/*.zh.20BEST.txt

            EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                NBEST/IWSLT09.testset.en.20BEST.txt
                or
                IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                NBEST/testset/*.en.20BEST.txt
[NOTE] larger NBEST lists can be generated from the lattice data files using the following tools:
              + IWSLT/2009/corpus/CHALLENGE/Chinese-English/tools/
                extract_NBEST.zh.CT_CE.testset.sh
              + IWSLT/2009/corpus/CHALLENGE/English-Chinese/tools/
                extract_NBEST.en.CT_EC.testset.sh

        (c) 1BEST hypotheses:
            CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                1BEST/IWSLT09.testset.zh.1BEST.txt
                or               
                IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                1BEST/testset/*.zh.1BEST.txt
            EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                1BEST/IWSLT09.testset.en.1BEST.txt
                or
                IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                1BEST/testset/*.en.1BEST.txt
[NOTE] submissions containing only the results for one translation direction will be excluded from the subjective evaluation for IWSLT 2009.
  • multiple run submissions are allowed, but participants have to explicitly indicate one PRIMARY run that will be used for human assessments. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) will be used for the subjective evaluation.
  • runs have to be submitted as a gzipped TAR archive (format see below) and sent as an email attachment to "Michael Paul" (michael.paul@nict.go.jp).
TAR archive file structure:
<UserID>/CT_CE.<UserID>.primary.CRR.txt
        /CT_CE.<UserID>.primary.ASR.<CONDITION>.txt
        /CT_EC.<UserID>.primary.CRR.txt
        /CT_EC.<UserID>.primary.ASR.<CONDITION>.txt
        /...
where: <UserID> = user ID of participant used to download data files
      <CONDITION> = SLF | <NUM>
      <NUM> = number of recognition hypotheses used for translation, e.g.,
                '1'  - 1-best recognition result
                '20' - 20-best hypotheses list
Examples:
    nict/CT_CE.nict.primary.CRR.txt
        /CT_CE.nict.primary.ASR.SLF.txt
        /CT_EC.nict.primary.CRR.txt
        /CT_EC.nict.primary.ASR.SLF.txt

        /CT_CE.nict.contrastive1.CRR.txt
        /CT_CE.nict.contrastive1.ASR.1.txt
        /CT_EC.nict.contrastive1.CRR.txt
        /CT_EC.nict.contrastive1.ASR.1.txt

        /CT_CE.nict.contrastive2.CRR.txt
        /CT_CE.nict.contrastive2.ASR.20.txt
        /CT_EC.nict.contrastive2.CRR.txt
        /CT_EC.nict.contrastive2.ASR.20.txt      
  • re-submitting your runs is allowed as long as the mails arrive BEFORE the submission deadline. If the same participant submits multiple TAR archives, only the runs in the most recent submission mail will be used for the IWSLT 2009 evaluation; earlier mails will be ignored.
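The archive layout described above can be produced with a few lines of Python. This is a minimal sketch, not an official tool: it assumes the four PRIMARY hypothesis files follow the naming scheme shown in the examples (here it creates empty placeholders just so the sketch runs end-to-end), and `nict` stands in for your own UserID. The same archive could of course be built with `tar czf` on the command line.

```python
import tarfile

user_id = "nict"  # replace with the UserID used to download the data files

# PRIMARY run files, named according to the required scheme:
# CT_<direction>.<UserID>.<run>.<input>[.<condition>].txt
runs = [
    f"CT_CE.{user_id}.primary.CRR.txt",
    f"CT_CE.{user_id}.primary.ASR.SLF.txt",
    f"CT_EC.{user_id}.primary.CRR.txt",
    f"CT_EC.{user_id}.primary.ASR.SLF.txt",
]

# For this sketch only: create empty placeholder files. In practice these
# are your translation hypothesis files, one translated sentence per line.
for name in runs:
    open(name, "a").close()

# Build the gzipped TAR archive; every file must sit under the
# <UserID>/ top-level directory inside the archive.
with tarfile.open(f"{user_id}_runs.tar.gz", "w:gz") as tar:
    for name in runs:
        tar.add(name, arcname=f"{user_id}/{name}")
```

CONTRASTIVE runs (e.g. `CT_CE.nict.contrastive1.ASR.1.txt`) are added to the same archive in the same way.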
We have prepared an online evaluation server that allows you to conduct additional experiments and confirm the effectiveness of innovative methods and features within the IWSLT 2009 evaluation framework. You can submit translation hypothesis files for any of the IWSLT 2009 translation tasks. The hypothesis file format is the same as for the official run submissions.

Before you can submit runs, you have to register a UserID/PassID. After logging in, click "Make a new Submission", select the "Translation Direction" and "Training Data Condition" used to generate the hypothesis file, upload the hypothesis file, specify a system ID and a short description that allows you to easily identify the run submission, and press "Calculate Scores".

The server sequentially calculates automatic scores for BLEU/NIST, WER/PER/TER, METEOR/F1/PREC/RECL, and GTM. The automatic scoring results will then be sent to you via email. In addition, you can access the "Submission Log", which keeps track of all your run submissions. For details on a specific run, click on the respective "Date". The scoring results for the "case+punc" evaluation specification (case-sensitive, with punctuation) are displayed in bold-face, and the scoring results for the "no_case+no_punc" evaluation specification (case-insensitive, without punctuation) are displayed in brackets.
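Of the metrics the server computes, WER is the simplest to reproduce locally: it is the word-level Levenshtein (edit) distance between hypothesis and reference, normalized by the reference length. The sketch below illustrates the idea with standard dynamic programming; it is not the server's implementation, and the example sentences are made up.

```python
def word_error_rate(hypothesis: str, reference: str) -> float:
    """Word-level edit distance, normalized by reference length."""
    hyp, ref = hypothesis.split(), reference.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

# two substitutions over a four-word reference -> WER = 0.5
print(word_error_rate("the hotel is booked", "a hotel was booked"))
```

PER relaxes this by ignoring word order, and TER additionally allows block shifts, so both require more machinery than this sketch.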

Registration


Workshop

The registration for the IWSLT 2009 workshop is now open. Please access the registration server and fill out the registration form.

Registration Fees:

            Regular      Student      Deadline         Payment
  Early     JPY 20,000   JPY 15,000   Nov 20, 2009     online
  Late      JPY 25,000   JPY 20,000   Nov 30, 2009     on-the-door (registration desk)
  On-site   JPY 30,000   JPY 25,000   Dec 1-2, 2009    on-the-door (registration desk)

The registration fee includes daily lunch, coffee breaks, a USB-stick version of the proceedings, participation in all sessions, and a banquet dinner on December 1. Please note that the registration fee is not refundable under any circumstances.

Online payment is possible only until the extended Early Registration deadline. During the Late Registration period, you still fill out the registration form online, but payment has to be made at the door at the workshop registration desk (7F Miraikan).

For on-the-door payments, only cash can be accepted.

If you need a visa to come to Japan, please contact the IWSLT Secretariat (iwslt@the-convention.co.jp) as soon as possible, and no later than October 2, 2009.

If you don't know whether you need a visa, please check here.


Accommodations


Hotel rates differ from day to day.
Please check out the links below or contact the hotel directly.


ホテル グランパシフィック LE DAIBA
(Grand Pacific Le Daiba)

[map] [reservations]
「〒135-8701 東京都港区台場2-6-1」
2-6-1 Daiba, Minato-ku, Tokyo 135-8701, Japan
(Tel) +81-3-5500-6711 (Fax) +81-3-5500-4507
http://www.grandpacific.jp/eng (English)
http://www.grandpacific.jp (Japanese)
19,000 JPY〜 (1 person, 1 night)
(U07) Daiba station [map]
←1min,180JPY→ (U08) Fune-no-kagakukan station [venue]

三井ガーデンホテル汐留イタリア街
(Mitsui Garden Hotel Shiodome Italia-gai)

[map] [reservations]
「〒105-0021 東京都港区東新橋2-14-24」
2-14-24, Higashi-shimbashi, Minato-ku, Tokyo 105-0021, Japan
(Tel) +81-3-3431-1131 (Fax) +81-3-3431-2431
http://www.gardenhotels.co.jp/eng/shiodome.html (English)
http://www.gardenhotels.co.jp/shiodome/index.html (Japanese)
10,300 JPY〜 (1 person, 1 night)
(U02) Shiodome station [map]
←15min,310JPY→ (U08) Fune-no-kagakukan station [venue]

ホテルヴィラ フォンテーヌ汐留
(Hotel Villa Fontaine Shiodome)

[map] [reservations]
「〒105-0021 東京都港区東新橋1-9-2」
1-9-2 Higashi-shinbashi Minato-ku, Tokyo 105-0021, Japan
(Tel) +81-3-3569-2220 (Fax) +81-3-3569-2111
http://www.hvf.jp/eng/shiodome.php (English)
http://www.hvf.jp/shiodome (Japanese)
http://www.hvf.jp/chi/shiodome.html (Chinese)
10,000 JPY〜 (1 person, 1 night)
(U02) Shiodome station [map]
←15min,310JPY→ (U08) Fune-no-kagakukan station [venue]

ホテル日航東京
(Hotel Nikko Tokyo)

[map] [reservations]
「〒135-8625 東京都港区台場1丁目9番1号」
1-9-1 Daiba, Minato-ku, Tokyo 135-8625, Japan
(Tel) +81-3-5500-5500 (Fax) +81-3-5500-2525
http://www.hnt.co.jp/en/index.html (English)
http://www.hnt.co.jp/ (Japanese)
http://www.jalhotels.com/cn/domestic/kanto/index.html#tokyo (Chinese)
9,500 JPY〜 (1 person, 1 night)
(U07) Daiba station [map]
←1min,180JPY→ (U08) Fune-no-kagakukan station [venue]

ホテルトラスティ東京ベイサイド
(Hotel Trusty Tokyo Bayside)

[map] [reservations]
「〒135-0063 東京都江東区有明3-1-5」
3-1-5 Ariake, Koto-ku, Tokyo 135-0063, Japan
(Tel) +81-3-6700-0001 (Fax) +81-3-6700-0007
http://www.trusty.jp/tokyobayside/pdf/tokyobayside_e.pdf (English)
http://www.trusty.jp/tokyobayside (Japanese)
6,700 JPY〜 (1 person, 1 night)
(U11) Kokusai-tenjijou-seimon station [map]
←6min,240JPY→ (U08) Fune-no-kagakukan station [venue]

ホテルサンルート有明
(Hotel Sunroute Ariake)

[map] [reservations]
「〒135-0063 東京都江東区有明3-1-20」
3-1-20 Ariake, Koto-ku, Tokyo 135-0063, Japan
(Tel) +81-3-5530-3610
http://www.sunroutehotel.jp/hari-eng/index.asp (English)
http://www.sunroutehotel.jp/ariake/ (Japanese)
http://www.sunroutehotel.jp/hari-chi/index.asp (Chinese)
6,500 JPY〜 (1 person, 1 night)
(U11) Kokusai-tenjijou-seimon station [map]
←6min,240JPY→ (U08) Fune-no-kagakukan station [venue]

東京ベイ有明ワシントンホテル
(Tokyo Bay Ariake Washington Hotel)

[map] [reservations]
「〒135-0063 東京都江東区有明3-1-28」
3-1-28 Ariake, Koto-ku, Tokyo 135-0063, Japan
(Tel) +81-3-5564-0111 (Fax) +81-3-5564-0525
http://www.wh-rsv.com/english/tokyo_bay_ariake (English)
http://www.wh-rsv.com/wh/hotels/ariake/index.html (Japanese)
http://www.wh-rsv.com/chinese/tokyo_bay_ariake (Chinese)
6,000 JPY〜 (1 person, 1 night)
(U12) Ariake station [map]
←7min,240JPY→ (U08) Fune-no-kagakukan station [venue]

Program


December 1, 2009

09:00 09:30 workshop registration

Workshop Opening
09:30 09:40 Welcome Remarks
Satoshi NAKAMURA (NICT, Japan)
Evaluation Campaign: "Overview Talk"
09:40 10:10 Overview of the IWSLT 2009 Evaluation Campaign
Michael PAUL (NICT, Japan)
coffee break
Evaluation Campaign: "Challenge Task"
10:30 11:00 Two methods for stabilizing MERT: NICT at IWSLT 2009
Masao UTIYAMA, Hirofumi YAMAMOTO, Eiichiro SUMITA (NICT, Japan)
11:00 11:30 Low-Resource Machine Translation Using MaTrEx: The DCU Machine Translation System for IWSLT 2009
Yanjun MA, Tsuyoshi OKITA, Özlem ÇETINOGLU, Jinhua DU, Andy WAY (Dublin City University, Ireland)
11:30 12:00 The CASIA Statistical Machine Translation System for IWSLT 2009
Maoxi LI, Jiajun ZHANG, Yu ZHOU, Chengqing ZONG (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; China)
lunch break
Invited Talk
13:00 14:00 Human Translation and Machine Translation
Philipp KOEHN (University of Edinburgh, UK)
Technical Paper: "Oral I"
14:00 14:30 Morphological Pre-Processing for Turkish to English Statistical Machine Translation
Arianna BISAZZA, Marcello FEDERICO (FBK-irst, Italy)
14:30 15:00 Enriching SCFG Rules Directly From Efficient Bilingual Chart Parsing
Martin CMEJREK, Bowen ZHOU, Bing XIANG (IBM, USA)
15:00 15:30 A Unified Framework for Phrase-Based, Hierarchical, and Syntax-Based Statistical Machine Translation
Hieu HOANG, Philipp KOEHN, Adam LOPEZ (Univ. Edinburgh, UK)
coffee break
Evaluation Campaign: "Poster I"
15:50 16:50 The TÜBITAK-UEKAE Statistical Machine Translation System for IWSLT 2009
Coskun MERMER, Hamza KAYA, Mehmet Ugur DOGAN (TÜBITAK-UEKAE, Turkey)
15:50 16:50 The UOT System: Improve String-to-Tree Translation Using Head-Driven Phrasal Structure Grammar and Predicate-Argument Structures
Xianchao WU, Takuya MATSUZAKI, Naoaki OKAZAKI, Yusuke MIYAO, Jun'ichi TSUJII (University of Tokyo, Japan)
15:50 16:50 The GREYC Translation Memory for the IWSLT2009 Evaluation Campaign: one step beyond translation memory
Yves LEPAGE, Adrien LARDILLEUX, Julien GOSME (University of Caen, France)
15:50 16:50 The ICT Statistical Machine Translation Systems for the IWSLT 2009
Haitao MI, Yang LIU, Tian XIA, Xinyan XIAO, Yang FENG, Jun XIE, Hao XIONG, Zhaopeng TU, Daqi ZHENG, Yajuan LU, Qun LIU (Institute of Computing Technology, Chinese Academy of Sciences; China)
15:50 16:50 The University of Washington Machine Translation System for IWSLT 2009
Mei YANG, Amittai AXELROD, Kevin DUH, Katrin KIRCHHOFF (University of Washington, USA)
15:50 16:50 Statistical Machine Translation adding Pattern-based Machine Translation in Chinese-English Translation
Jin'ichi MURAKAMI, Masato TOKUHISA, Satoru IKEHARA (Tottori University, Japan)
Demo Session
16:50 17:20 Network-based Speech-to-Speech Translation
Chiori HORI, Sakriani SAKTI, Michael PAUL, Satoshi NAKAMURA (NICT, Japan)
Banquet
18:00 20:00 Restaurant "LA TERRE" (Miraikan, 7F)

December 2, 2009

09:00 09:30 workshop registration

Invited Talk
09:30 10:30 Two-way Speech-to-Speech Translation for Communicating Across Language Barriers
Premkumar NATARAJAN (BBN Technologies, USA)
coffee break
Technical Paper: "Oral II"
10:50 11:20 Structural Support Vector Machines for Log-Linear Approach in Statistical Machine Translation
Katsuhiko HAYASHI (Doshisha University, Japan); Taro WATANABE, Hajime TSUKADA, Hideki ISOZAKI (NTT, Japan)
11:20 11:50 Online Language Model Adaptation for Spoken Dialog Translation
Germán SANCHIS-TRILLES (Universitat Politècnica de València, Spain); Mauro CETTOLO, Nicola BERTOLDI, Marcello FEDERICO (FBK-irst, Italy)
lunch break
Invited Talk
13:00 14:00 Monolingual Knowledge Acquisition and a Multilingual Information Environment
Kentaro TORISAWA (NICT, Japan)
Evaluation Campaign: "Poster II"
14:00 15:00 AppTek Turkish-English Machine Translation System Description for IWSLT 2009
Selçuk KÖPRÜ (Apptek Inc., Turkey)
14:00 15:00 LIG approach for IWSLT09 : Using Multiple Morphological Segmenters for Spoken Language Translation of Arabic
Fethi BOUGARES, Laurent BESACIER, Hervé BLANCHON (LIG, France)
14:00 15:00 Barcelona Media SMT system description for the IWSLT 2009: introducing source context information
Marta R. COSTA-JUSSA, Rafael E. BANCHS (Barcelona Media, Spain)
14:00 15:00 FBK @ IWSLT-2009
Nicola BERTOLDI, Arianna BISAZZA, Mauro CETTOLO, Marcello FEDERICO (FBK-irst, Italy); Germán SANCHIS-TRILLES (Universitat Politècnica de València, Spain)
14:00 15:00 LIUM's Statistical Machine Translation Systems for IWSLT 2009
Holger SCHWENK, Loïc BARRAULT, Yannick ESTÈVE, Patrik LAMBERT (University of Le Mans, France)
14:00 15:00 I²R's Machine Translation System for IWSLT 2009
Xiangyu DUAN, Deyi XIONG, Hui ZHANG, Min ZHANG, Haizhou LI (Institute for Infocomm Research, Singapore)
coffee break
Evaluation Campaign: "BTEC Task"
15:20 15:50 The NUS Statistical Machine Translation System for IWSLT 2009
Preslav NAKOV, Chang LIU, Wei LU, Hwee Tou NG (National University of Singapore, Singapore)
15:50 16:20 The UPV Translation System for IWSLT 2009
Guillem GASCÓ, Joan Andreu SÁNCHEZ (Universitat Politècnica de València, Spain)
16:20 16:50 The MIT-LL/AFRL System for IWSLT 2009
Wade SHEN, Brian DELANEY, Arya Ryan AMINZADEH (MIT Lincoln Laboratory, USA); Timothy ANDERSON, Raymond SLYH (Air Force Research Laboratory, USA)
Workshop Closing
16:50 17:00 Closing Remarks
Marcello FEDERICO (FBK-irst, Italy)

Keynote Speeches


Keynote Speech 1

Human Translation and Machine Translation
Philipp KOEHN (University of Edinburgh, UK)
While most recent machine translation work has focused on the gisting application (i.e., translating web pages), another important application is to aid human translators. To build better computer-aided translation tools, we first need to understand how human translators work. We discuss how human translators work and what tools they typically use. We also build a novel tool that offers post-editing, interactive sentence completion, and display of translation options (online at www.caitra.org). We collected timing logs of interactions with the tool, which allow detailed analysis of translator behavior.

Keynote Speech 2

Two-way Speech-to-Speech Translation for Communicating Across Language Barriers
Premkumar NATARAJAN (BBN Technologies, USA)
Two-way speech-to-speech (S2S) translation is a spoken language application that integrates multiple technologies including speech recognition, machine translation, text-to-speech synthesis, and dialog management. In recent years, research into S2S systems has resulted in several modeling techniques for improving coverage on broad domains and rapid configuration for new language pairs or domains. This talk will highlight recent advances in the S2S area, ranging from improvements in component technologies to improvements in the end-to-end system for mobile use. I will also present metrics for evaluating S2S technology, a methodology for determining the impact of different causes of errors, and future directions for research and development.

Keynote Speech 3

Monolingual Knowledge Acquisition and a Multilingual Information Environment
Kentaro TORISAWA (NICT, Japan)
Large-scale knowledge acquisition from the Web has been a popular research topic over the last five years. This talk gives an overview of our current project aiming at the acquisition of a large-scale semantic network from the Web, and explores its possible interaction with machine translation research. In particular, I would like to focus on two topics: multilingual corpora as a source of knowledge, and the applications of machine translation enabled by our technology. I will discuss a framework of bilingual co-training that gives a marked improvement in the accuracy of the acquired knowledge by using two corpora written in two different languages. I will also show that our technology can enable a new type of machine translation task in Web applications.

Proceedings

- Author Index -

Evaluation Campaign
pp.1-18 paper slides bib Overview of the IWSLT 2009 Evaluation Campaign
Michael PAUL
pp.19-23 paper (not yet) bib apptek
AppTek Turkish-English Machine Translation System Description for IWSLT 2009
Selçuk KÖPRÜ
pp.24-28 paper poster bib bmrc
Barcelona Media SMT system description for the IWSLT 2009: introducing source context information
Marta R. COSTA-JUSSA, Rafael E. BANCHS
pp.29-36 paper slides bib dcu
Low-Resource Machine Translation Using MaTrEx: The DCU Machine Translation System for IWSLT 2009
Yanjun MA, Tsuyoshi OKITA, Özlem ÇETINOGLU, Jinhua DU, Andy WAY
pp.37-44 paper poster bib fbk
FBK @ IWSLT-2009
Nicola BERTOLDI, Arianna BISAZZA, Mauro CETTOLO, Marcello FEDERICO (FBK-irst, Italy); Germán SANCHIS-TRILLES (Universitat Politècnica de València, Spain)
pp.45-49 paper poster bib greyc
The GREYC Translation Memory for the IWSLT 2009 Evaluation Campaign: one step beyond translation memory
Yves LEPAGE, Adrien LARDILLEUX, Julien GOSME
pp.50-54 paper poster bib i2r
I²R's Machine Translation System for IWSLT 2009
Xiangyu DUAN, Deyi XIONG, Hui ZHANG, Min ZHANG, Haizhou LI
pp.55-59 paper poster bib ict
The ICT Statistical Machine Translation Systems for the IWSLT 2009
Haitao MI, Yang LIU, Tian XIA, Xinyan XIAO, Yang FENG, Jun XIE, Hao XIONG, Zhaopeng TU, Daqi ZHENG, Yajuan LU, Qun LIU
pp.60-64 paper poster bib lig
LIG approach for IWSLT09 : Using Multiple Morphological Segmenters for Spoken Language Translation of Arabic
Fethi BOUGARES, Laurent BESACIER, Hervé BLANCHON (LIG, France)
pp.65-70 paper poster bib lium
LIUM's Statistical Machine Translation Systems for IWSLT 2009
Holger SCHWENK, Loïc BARRAULT, Yannick ESTÈVE, Patrik LAMBERT
pp.71-78 paper slides bib mit
The MIT-LL/AFRL IWSLT-2009 System
Wade SHEN, Brian DELANEY, Arya Ryan AMINZADEH (MIT Lincoln Laboratory, USA); Timothy ANDERSON, Raymond SLYH (Air Force Research Laboratory)
pp.79-82 paper slides bib nict
Two methods for stabilizing MERT: NICT at IWSLT 2009
Masao UTIYAMA, Hirofumi YAMAMOTO, Eiichiro SUMITA
pp.83-90 paper slides bib nlpr
The CASIA Statistical Machine Translation System for IWSLT 2009
Maoxi LI, Jiajun ZHANG, Yu ZHOU, Chengqing ZONG
pp.91-98 paper slides bib nus
The NUS Statistical Machine Translation System for IWSLT 2009
Preslav NAKOV, Chang LIU, Wei LU, Hwee Tou NG
pp.99-106 paper poster bib tokyo
The UOT System: Improve String-to-Tree Translation Using Head-Driven Phrasal Structure Grammar and Predicate-Argument Structures
Xianchao WU, Takuya MATSUZAKI, Naoaki OKAZAKI, Yusuke MIYAO, Jun'ichi TSUJII
pp.107-112 paper poster bib tottori
Statistical Machine Translation adding Pattern-based Machine Translation in Chinese-English Translation
Jin'ichi MURAKAMI, Masato TOKUHISA, Satoru IKEHARA
pp.113-117 paper poster bib tubitak
The TÜBITAK-UEKAE Statistical Machine Translation System for IWSLT 2009
Coskun MERMER, Hamza KAYA, Mehmet Ugur DOGAN
pp.118-123 paper slides bib upv
UPV Translation System for IWSLT 2009
Guillem GASCÓ, Joan Andreu SÁNCHEZ
pp.124-128 paper poster bib uw
The University of Washington Machine Translation System for IWSLT 2009
Mei YANG, Amittai AXELROD, Kevin DUH, Katrin KIRCHHOFF

Technical Paper
pp.129-135 paper slides bib Morphological Pre-Processing for Turkish to English Statistical Machine Translation
Arianna BISAZZA, Marcello FEDERICO
pp.136-143 paper slides bib Enriching SCFG Rules Directly From Efficient Bilingual Chart Parsing
Martin CMEJREK, Bowen ZHOU, Bing XIANG
pp.144-151 paper slides bib Structural Support Vector Machines for Log-Linear Approach in Statistical Machine Translation
Katsuhiko HAYASHI, Taro WATANABE, Hajime TSUKADA, Hideki ISOZAKI
pp.152-159 paper slides bib A Unified Framework for Phrase-Based, Hierarchical, and Syntax-Based Statistical Machine Translation
Hieu HOANG, Philipp KOEHN, Adam LOPEZ
pp.160-167 paper slides bib Online Language Model Adaptation for Spoken Dialog Translation
German SANCHIS-TRILLES, Mauro CETTOLO, Nicola BERTOLDI, Marcello FEDERICO

Demo
pp.168-168 paper slides bib Network-based Speech-to-Speech Translation
Chiori HORI, Sakriani SAKTI, Michael PAUL, Noriyuki KIMURA, Yutaka ASHIKARI, Ryosuke ISOTANI, Eiichiro SUMITA, Satoshi NAKAMURA (NICT, Japan)

Keynote Speech
      -       abstract slides - Human Translation and Machine Translation
Philipp KOEHN (University of Edinburgh, UK)
      -       abstract (not yet) - Two-way Speech-to-Speech Translation for Communicating Across Language Barriers
Premkumar NATARAJAN (BBN Technologies, USA)
      -       abstract slides - Monolingual Knowledge Acquisition and a Multilingual Information Environment
Kentaro TORISAWA (NICT, Japan)

Author Index


  A  
  AMINZADEH, Arya Ryan  71
  ANDERSON, Timothy   71
  ASHIKARI, Yutaka   168
  AXELROD, Amittai   124
  B  
  BANCHS, Rafael E.   24
  BARRAULT, Loïc   65
  BERTOLDI, Nicola   37, 160
  BESACIER, Laurent   60
  BISAZZA, Arianna   37, 129
  BLANCHON, Hervé   60
  BOUGARES, Fethi   60
  C  
  ÇETINOGLU, Özlem   29
  CETTOLO, Mauro   37, 160
  CMEJREK, Martin   136
  COSTA-JUSSÀ, Marta R.   24
  D  
  DELANEY, Brian   71
  DOGAN, Mehmet Ugur   113
  DU, Jinhua   29
  DUAN, Xiangyu   50
  DUH, Kevin   124
  E  
  ESTÈVE, Yannick   65
  F  
  FEDERICO, Marcello   37, 129, 160
  FENG, Yang   55
  G  
  GASCÓ, Guillem   118
  GOSME, Julien   45
  H  
  HAYASHI, Katsuhiko   144
  HOANG, Hieu   152
  HORI, Chiori   168
  I  
  IKEHARA, Satoru   107
  ISOTANI, Ryosuke   168
  ISOZAKI, Hideki   144
  K  
  KAYA, Hamza   113
  KIMURA, Noriyuki   168
  KIRCHHOFF, Katrin   124
  KOEHN, Philipp   152
  KÖPRÜ, Selçuk   19
  L  
  LAMBERT, Patrik   65
  LARDILLEUX, Adrien   45
  LEPAGE, Yves   45
  LI, Haizhou   50
  LI, Maoxi   83
  LIU, Chang   91
  LIU, Qun   55
  LIU, Yang   55
  LOPEZ, Adam   152
  LU, Wei   91
  LU, Yajuan   55
  M  
  MA, Yanjun   29
  MATSUZAKI, Takuya   99
  MERMER, Coskun   113
  MI, Haitao   55
  MIYAO, Yusuke   99
  MURAKAMI, Jin'ichi   107
  N  
  NAKAMURA, Satoshi   168
  NAKOV, Preslav   91
  NG, Hwee Tou   91
  O  
  OKAZAKI, Naoaki   99
  OKITA, Tsuyoshi   29
  P  
  PAUL, Michael   1, 168
  S  
  SAKTI, Sakriani   168
  SÁNCHEZ, Joan Andreu   118
  SANCHIS-TRILLES, Germán   37, 160
  SCHWENK, Holger   65
  SHEN, Wade   71
  SLYH, Raymond   71
  SUMITA, Eiichiro   79, 168
  T  
  TOKUHISA, Masato   107
  TSUJII, Jun'ichi   99
  TSUKADA, Hajime   144
  TU, Zhaopeng   55
  U  
  UTIYAMA, Masao   79
  W  
  WATANABE, Taro   144
  WAY, Andy   29
  WU, Xianchao   99
  X  
  XIA, Tian   55
  XIANG, Bing   136
  XIAO, Xinyan   55
  XIE, Jun   55
  XIONG, Deyi   50
  XIONG, Hao   55
  Y  
  YAMAMOTO, Hirofumi   79
  YANG, Mei   124
  Z  
  ZHANG, Hui   50
  ZHANG, Jiajun   83
  ZHANG, Min   50
  ZHENG, Daqi   55
  ZHOU, Bowen   136
  ZHOU, Yu   83
  ZONG, Chengqing   83

Bibliography

@inproceedings{iwslt09:EC:overview,
author= {Michael Paul},
title= {{Overview of the IWSLT 2009 Evaluation Campaign}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {1-18},
}
 
@inproceedings{iwslt09:EC:apptek,
author= {Sel\c{c}uk K\"{o}pr\"{u}},
title= {{AppTek Turkish-English Machine Translation System Description for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {19-23},
}
 
@inproceedings{iwslt09:EC:bmrc,
author= {Marta R. Costa-Juss\`{a} and Rafael E. Banchs},
title= {{Barcelona Media SMT system description for the IWSLT 2009: introducing source context information}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {24-28},
}
 
@inproceedings{iwslt09:EC:dcu,
author= {Yanjun Ma and Tsuyoshi Okita and \"{O}zlem \c{C}etino\u{g}lu and Jinhua Du and Andy Way},
title= {{Low-Resource Machine Translation Using MaTrEx: The DCU Machine Translation System for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {29-36},
}
 
@inproceedings{iwslt09:EC:fbk,
author= {Nicola Bertoldi and Arianna Bisazza and Mauro Cettolo and Germ\'{a}n Sanchis-Trilles and Marcello Federico},
title= {{FBK @ IWSLT-2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {37-44},
}
 
@inproceedings{iwslt09:EC:greyc,
author= {Yves Lepage and Adrien Lardilleux and Julien Gosme},
title= {{The GREYC Translation Memory for the IWSLT 2009 Evaluation Campaign: one step beyond translation memory}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {45-49},
}
 
@inproceedings{iwslt09:EC:i2r,
author= {Xiangyu Duan and Deyi Xiong and Hui Zhang and Min Zhang and Haizhou Li},
title= {{I${}^{2}$R's Machine Translation System for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {50-54},
}
 
@inproceedings{iwslt09:EC:ict,
author= {Haitao Mi and Yang Liu and Tian Xia and Xinyan Xiao and Yang Feng and Jun Xie and Hao Xiong and Zhaopeng Tu and Daqi Zheng and Yajuan Lu and Qun Liu},
title= {{The ICT Statistical Machine Translation Systems for the IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {55-59},
}
 
@inproceedings{iwslt09:EC:lig,
author= {Fethi Bougares and Laurent Besacier and Herv\'{e} Blanchon},
title= {{LIG approach for IWSLT09 : Using Multiple Morphological Segmenters for Spoken Language Translation of Arabic}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {60-64},
}
 
@inproceedings{iwslt09:EC:lium,
author= {Holger Schwenk and Lo\"{\i}c Barrault and Yannick Est\`{e}ve and Patrik Lambert},
title= {{LIUM's Statistical Machine Translation Systems for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {65-70},
}
 
@inproceedings{iwslt09:EC:mit,
author= {Wade Shen and Brian Delaney and Arya Ryan Aminzadeh and Timothy Anderson and Raymond Slyh},
title= {{The MIT-LL/AFRL IWSLT-2009 System}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {71-78},
}
 
@inproceedings{iwslt09:EC:nict,
author= {Masao Utiyama and Hirofumi Yamamoto and Eiichiro Sumita},
title= {{Two methods for stabilizing MERT: NICT at IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {79-82},
}
 
@inproceedings{iwslt09:EC:nlpr,
author= {Maoxi Li and Jiajun Zhang and Yu Zhou and Chengqing Zong},
title= {{The CASIA Statistical Machine Translation System for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {83-90},
}
 
@inproceedings{iwslt09:EC:nus,
author= {Preslav Nakov and Chang Liu and Wei Lu and Hwee Tou Ng},
title= {{The NUS Statistical Machine Translation System for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {91-98},
}
 
@inproceedings{iwslt09:EC:tokyo,
author= {Xianchao Wu and Takuya Matsuzaki and Naoaki Okazaki and Yusuke Miyao and Jun'ichi Tsujii},
title= {{The UOT System: Improve String-to-Tree Translation Using Head-Driven Phrasal Structure Grammar and Predicate-Argument Structures}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {99-106},
}
 
@inproceedings{iwslt09:EC:tottori,
author= {Jin'ichi Murakami and Masato Tokuhisa and Satoru Ikehara},
title= {{Statistical Machine Translation adding Pattern-based Machine translation in Chinese-English Translation}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {107-112},
}
 
@inproceedings{iwslt09:EC:tubitak,
author= {{Co\c{s}kun} Mermer and Hamza Kaya and Mehmet U\v{g}ur Do\v{g}an},
title= {{The T\"{U}BITAK-UEKAE Statistical Machine Translation System for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {113-117},
}
 
@inproceedings{iwslt09:EC:upv,
author= {Guillem Gasc\'{o} and Joan Andreu S\'{a}nchez},
title= {{UPV Translation System for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {118-123},
}
 
@inproceedings{iwslt09:EC:uw,
author= {Mei Yang and Amittai Axelrod and Kevin Duh and Katrin Kirchhoff},
title= {{The University of Washington Machine Translation System for IWSLT 2009}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {124-128},
}
 
@inproceedings{iwslt09:TP:bisazza,
author= {Arianna Bisazza and Marcello Federico},
title= {{Morphological Pre-Processing for Turkish to English Statistical Machine Translation}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {129-135},
}
 
@inproceedings{iwslt09:TP:cmejrek,
author= {Martin Cmejrek and Bowen Zhou and Bing Xiang},
title= {{Enriching SCFG Rules Directly From Efficient Bilingual Chart Parsing}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {136-143},
}
 
@inproceedings{iwslt09:TP:hayashi,
author= {Katsuhiko Hayashi and Taro Watanabe and Hajime Tsukada and Hideki Isozaki},
title= {{Structural Support Vector Machines for Log-Linear Approach in Statistical Machine Translation}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {144-151},
}
 
@inproceedings{iwslt09:TP:hoang,
author= {Hieu Hoang and Philipp Koehn and Adam Lopez},
title= {{A Unified Framework for Phrase-Based, Hierarchical, and Syntax-Based Statistical Machine Translation}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {152-159},
}
 
@inproceedings{iwslt09:TP:sanchis,
author= {Germ\'{a}n Sanchis-Trilles and Mauro Cettolo and Nicola Bertoldi and Marcello Federico},
title= {{Online Language Model Adaptation for Spoken Dialog Translation}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {160-167},
}
 
@inproceedings{iwslt09:DEMO:nict,
author= {Chiori Hori and Sakriani Sakti and Michael Paul and Noriyuki Kimura and Yutaka Ashikari and Ryosuke Isotani and Eiichiro Sumita and Satoshi Nakamura},
title= {{Network-based Speech-to-Speech Translation}},
year= {2009},
booktitle= {Proc. of the International Workshop on Spoken Language Translation},
address= {Tokyo, Japan},
pages= {168},
}

Venue

National Museum of Emerging Science and Innovation

2-41, Aomi, Koto-ku, Tokyo, Japan
(Tel) +81-3-3570-9151 (Fax) +81-3-3570-9150
http://www.miraikan.jst.go.jp/en

entrance @ 1F (floor map)
workshop @ 7F (floor map)

Access
Tourism Info

Gallery

IWSLT 2009
December 1-2, 2009
National Museum of Emerging Science and Innovation
Tokyo, Japan


December 1
camera 1


camera 2
December 2
camera 1


camera 2

Organizers

Organizers
  • Alex Waibel (CMU, USA / UKA, Germany)
  • Marcello Federico (FBK, Italy)
  • Satoshi Nakamura (NICT, Japan)

Chairs

  • Eiichiro Sumita (NICT, Japan; Workshop)
  • Michael Paul (NICT, Japan; Evaluation Campaign)
  • Marcello Federico (FBK, Italy; Technical Paper)

Program Committee

  • Laurent Besacier (LIG, France)
  • Francisco Casacuberta (ITI-UPV, Spain)
  • Boxing Chen (NRC, Canada)
  • Philipp Koehn (Univ. Edinburgh, UK)
  • Philippe Langlais (Univ. Montreal, Canada)
  • Geunbae Lee (Postech, Korea)
  • Yves Lepage (GREYC, France)
  • Haizhou Li (I2R, Singapore)
  • Qun Liu (ICT, China)
  • José B. Mariño (TALP-UPC, Spain)
  • Coskun Mermer (TUBITAK, Turkey)
  • Christof Monz (QMUL, UK)
  • Hermann Ney (RWTH, Germany)
  • Holger Schwenk (LIUM, France)
  • Wade Shen (MIT-LL, USA)
  • Hajime Tsukada (NTT, Japan)
  • Haifeng Wang (TOSHIBA, China)
  • Andy Way (DCU, Ireland)
  • Chengqing Zong (CASIA, China)

Local Arrangements

  • Mari Oku (NICT, Japan)

Supporting Organizations

National Institute of Information and Communications Technology
The Scientific and Technological Research Council of Turkey (TUBITAK), National Research Institute of Electronics and Cryptology (UEKAE)

Contact


WORKSHOP ORGANIZATION
Eiichiro Sumita
(reverse) email: jp *dot* co *dot* nict *at* sumita *dot* eiichiro

EVALUATION CAMPAIGN
Michael Paul
(reverse) email: jp *dot* go *dot* nict *at* paul *dot* michael

TECHNICAL PAPER
Marcello Federico
(reverse) email: eu *dot* fbk *at* federico

LOCAL ARRANGEMENT
Mari Oku
(reverse) email: jp *dot* go *dot* nict *dot* khn *at* iwsltlocal09


National Institute of Information and Communications Technology (NICT)
Knowledge Creating Communication Research Center
MASTAR Project

2-2-2 Hikaridai, Keihanna Science City, Kyoto 619-0288, Japan
TEL: +81-774-95-1301
FAX: +81-774-95-1308

References

Events Co-located with IWSLT 2009

IWSLT Evaluation Campaigns

IWSLT2009: 2009年12月 アーカイブ

2009年12月 1日

Theme

The International Workshop on Spoken Language Translation (IWSLT) is a yearly, open evaluation campaign for spoken language translation, followed by a scientific workshop at which both system descriptions and scientific papers are presented. IWSLT's evaluations are not competition-oriented; rather, their goal is to foster cooperative work and scientific exchange. In this respect, IWSLT proposes challenging research tasks and an open experimental infrastructure for the scientific community working on spoken and written language translation.

Evaluation Campaign

The 6th International Workshop on Spoken Language Translation will take place in Tokyo, Japan, in December 2009. The focus of this year's evaluation campaign is the translation of task-oriented human dialogs in travel situations. The speech data was recorded through human interpreters: native speakers of different languages were asked to complete travel-related tasks, such as hotel reservations, using their mother tongue, and the freely uttered conversation was translated by human interpreters. The resulting speech data was annotated with dialog and speaker information. For the Challenge Task, IWSLT participants have to translate both the Chinese and the English outputs of the automatic speech recognizers (lattice, N/1BEST) into English and Chinese, respectively.

As in previous IWSLT events, a standard BTEC Task will be provided. However, the BTEC Task focuses on text input only, i.e., no automatic speech recognition results (lattice, N/1BEST) have to be translated. In addition to the Arabic-English and Chinese-English translation tasks, this year's evaluation campaign features Turkish as a new input language.

Each participant in the evaluation campaign is requested to submit a paper describing the MT system, the utilized resources, and results on the provided test data. Contrastive run submissions using only the bilingual resources provided by IWSLT, as well as investigations into the contribution of each utilized resource, are highly appreciated. Moreover, all participants are requested to present their papers at the workshop.

Scientific Paper

In addition to the evaluation campaign, the IWSLT 2009 workshop also invites scientific paper submissions related to spoken language technologies. Possible topics include, but are not limited to:

  • Spoken dialog modeling
  • Integration of ASR and MT
  • SMT, EBMT, RBMT, Hybrid MT
  • MT evaluation
  • Language resources for MT
  • Open source software for MT
  • Pivot-language-based MT
  • Task adaptation and portability in MT

Posted by mpaul: 10:00

Evaluation Campaign

The evaluation campaign is carried out using BTEC (Basic Travel Expression Corpus), a multilingual speech corpus containing tourism-related sentences similar to those that are usually found in phrasebooks for tourists going abroad. In addition, parts of the SLDB (Spoken Language Databases) corpus, a collection of human-mediated cross-lingual dialogs in travel situations, are provided to the participants of the Challenge Task. Details about the supplied corpora, the data set conditions for each track, the guidelines on how to submit one's translation results, and the evaluation specifications used in this workshop are given below.

Please note that, compared to previous IWSLT evaluation campaigns, the guidelines on how to use the language resources for each data track have changed for IWSLT 2009. Starting in 2007, we encouraged everyone to collect out-of-domain language resources and tools that could be shared between the participants. This was very helpful for many participants and enabled many interesting experiments, but it had the side effect that system outputs became difficult to compare: it was impossible to tell whether certain gains in performance were triggered by better suited (or simply more) language resources (engineering aspects) or by improvements in the underlying decoding algorithms and statistical models (research aspects). After the IWSLT 2008 workshop, many participants asked us to focus on the research aspects for IWSLT 2009.

Therefore, the monolingual and bilingual language resources that may be used to train the translation engines for the primary runs are limited to the supplied corpus for each translation task. This includes all supplied development sets, i.e., you are free to use these data sets as you wish, for tuning of model parameters, as additional training bitext, etc. All other language resources besides the ones for the given translation task should be treated as "additional language resources"; for example, any additional dictionaries, word lists, or bitext corpora, such as those provided by LDC. In addition, some participants asked whether they could use the supplied BTEC TE and BTEC AE resources for the BTEC CE task; these should also be treated as "additional resources". Because it is impossible to limit the usage of linguistic tools like word segmentation tools, parsers, etc., such tools are allowed for preprocessing the supplied corpus, but we kindly ask participants to describe in detail which tools were applied for data preprocessing in their system description paper.

In order to motivate participants to continue to explore the effects of additional language resources (model adaptation, OOV handling, etc.), we DO ACCEPT contrastive runs based on additional resources. These will be evaluated automatically using the same framework as the primary runs, so the results will be directly comparable to this year's primary runs. Due to the workshop budget limits, however, it would be difficult to include all contrastive runs in the subjective evaluation. Therefore, we kindly ask the participants for a contribution if they would like to obtain a human assessment of their contrastive runs as well. If you intend to do so, please contact us as soon as possible, so that we can adjust the evaluation schedule accordingly. Contrastive run results will not appear in the overview paper, but participants are free to report their findings in the MT system description paper or even in a separate scientific paper submission.

[Corpus Specifications]

[Translation Input Conditions]

[Evaluation Specifications]



Corpus Specifications

BTEC Training Corpus:
  • data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <SENTENCE_ID>\01\<MT_TRAINING_SENTENCE>
    • Field_1: sentence ID
    • Field_2: paraphrase ID
    • Field_3: MT training sentence
  • example:
    • TRAIN_00001\01\This is the first training sentence.
    • TRAIN_00002\01\This is the second training sentence.
  • Arabic-English (AE)
  • Chinese-English (CE)
  • Turkish-English (TE)

    • 20K sentences randomly selected from the BTEC corpus
    • coding: UTF-8
    • text is case-sensitive and includes punctuation marks
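The three-field format above can be parsed by splitting each line on the backslash delimiter. A minimal sketch (the function and file names are our own, not part of the official tools):

```python
def parse_corpus_line(line):
    """Split a corpus line of the form <SENTENCE_ID>\\<PARAPHRASE_ID>\\<TEXT>."""
    # Split at most twice: the sentence text never contains the '\'
    # delimiter, but it may be empty.
    sentence_id, paraphrase_id, text = line.rstrip("\n").split("\\", 2)
    return sentence_id, paraphrase_id, text

def load_corpus(path):
    """Read a UTF-8 corpus file into a {sentence ID: sentence} dict."""
    corpus = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            sid, _pid, text = parse_corpus_line(line)
            corpus[sid] = text
    return corpus
```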

BTEC Develop Corpus:

  • text input, reference translations of BTEC sentences
  • data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <SENTENCE_ID>\<PARAPHRASE_ID>\<TEXT>
    • Field_1: sentence ID
    • Field_2: paraphrase ID
    • Field_3: MT develop sentence / reference translation
  • text input example:
    • DEV_001\01\This is the first develop sentence.
    • DEV_002\01\This is the second develop sentence.
  • reference translation example:
  • DEV_001\01\1st reference translation for 1st input
    DEV_001\02\2nd reference translation for 1st input
    ...
    DEV_002\01\1st reference translation for 2nd input
    DEV_002\02\2nd reference translation for 2nd input
    ...
  • Arabic-English
    • CSTAR03 testset: 506 sentences, 16 reference translations
    • IWSLT04 testset: 500 sentences, 16 reference translations
    • IWSLT05 testset: 506 sentences, 16 reference translations
    • IWSLT07 testset: 489 sentences, 6 reference translations
    • IWSLT08 testset: 507 sentences, 16 reference translations

  • Chinese-English
    • CSTAR03 testset: 506 sentences, 16 reference translations
    • IWSLT04 testset: 500 sentences, 16 reference translations
    • IWSLT05 testset: 506 sentences, 16 reference translations
    • IWSLT07 testset: 489 sentences, 6 reference translations
    • IWSLT08 testset: 507 sentences, 16 reference translations

  • Turkish-English
    • CSTAR03 testset: 506 sentences, 16 reference translations
    • IWSLT04 testset: 500 sentences, 16 reference translations
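Since each develop-set sentence comes with multiple references distinguished only by the paraphrase ID, evaluation scripts typically group the reference file by sentence ID. A minimal sketch, assuming the reference translation format above:

```python
from collections import defaultdict

def group_references(lines):
    """Group reference translations of the form
    <SENTENCE_ID>\\<PARAPHRASE_ID>\\<REFERENCE> by sentence ID,
    returning the references in paraphrase-ID order."""
    refs = defaultdict(list)
    for line in lines:
        sid, pid, text = line.rstrip("\n").split("\\", 2)
        refs[sid].append((int(pid), text))
    return {sid: [t for _, t in sorted(pairs)] for sid, pairs in refs.items()}
```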

BTEC Test Corpus:

  • Arabic-English
  • Chinese-English
  • Turkish-English
    • 470 unseen sentences of the BTEC evaluation corpus
    • coding: → see BTEC Develop Corpus
    • data format: → see BTEC Develop Corpus


CHALLENGE Training Corpus:
  • TXT data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <DIALOG_ID>\<SENTENCE_ID>\<MT_TRAINING_SENTENCE>
    • Field_1: dialog ID
    • Field_2: sentence ID
    • Field_3: MT training sentence
    • example:
    • train_dialog01\01\This is the first training sentence.
    • train_dialog01\02\This is the second training sentence.
    • ...
  • INFO data format:
    • each line consists of three fields divided by the character '\'
    • sentence consisting of words divided by single spaces
    • format: <DIALOG_ID>\<SENTENCE_ID>\<SPEAKER_TAG>
    • Field_1: dialog ID
    • Field_2: sentence ID
    • Field_3: speaker annotations ('a': agent, 'c': customer, 'i': interpreter)
    • example:
    • train_dialog01\01\a
    • train_dialog01\02\i
    • train_dialog01\03\a
    • ...
    • train_dialog398\20\i
    • train_dialog398\21\i
    • train_dialog398\22\c
  • Chinese-English (CE)
  • English-Chinese (EC)

    • 394 dialogs, 10K sentences from the SLDB corpus
    • coding: UTF-8
    • word segmentations according to ASR output segmentation
    • text is case-sensitive and includes punctuation marks
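Because the TXT and INFO files share the same dialog/sentence IDs, the speaker tag of each training sentence can be recovered with a simple join. A sketch under that assumption (the function name is ours):

```python
def join_speaker_info(txt_lines, info_lines):
    """Attach speaker annotations ('a': agent, 'c': customer,
    'i': interpreter) to training sentences via the shared
    (dialog ID, sentence ID) key."""
    speakers = {}
    for line in info_lines:
        dlg, sid, tag = line.rstrip("\n").split("\\", 2)
        speakers[(dlg, sid)] = tag
    joined = []
    for line in txt_lines:
        dlg, sid, text = line.rstrip("\n").split("\\", 2)
        joined.append((dlg, sid, speakers.get((dlg, sid)), text))
    return joined
```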

CHALLENGE Develop Corpus:

  • ASR output (lattice, NBEST, 1BEST), correct recognition result transcripts (text), reference translations of SLDB dialogs
  • data format:
    • 1-BEST
      • each line consists of three fields divided by the character '\'
      • sentence consisting of words divided by single spaces
      • format: <SENTENCE_ID>\01\<RECOGNITION_HYPOTHESIS>
      • Field_1: sentence ID
      • Field_2: paraphrase ID
      • Field_3: best recognition hypothesis
      • example (input):
      • IWSLT09_CT.devset_dialog01_02\01\best ASR hypothesis for 1st utterance
        IWSLT09_CT.devset_dialog01_04\01\best ASR hypothesis for 2nd utterance
        IWSLT09_CT.devset_dialog01_06\01\best ASR hypothesis for 3rd utterance
        ...
    • N-BEST
      • each line consists of three fields divided by the character '\'
      • sentence consisting of words divided by single spaces
      • format: <SENTENCE_ID>\01\<RECOGNITION_HYPOTHESIS>
      • Field_1: sentence ID
      • Field_2: NBEST ID (max: 20)
      • Field_3: recognition hypothesis
      • example (input):
      • IWSLT09_CT.devset_dialog01_02\01\best ASR hypothesis for 1st utterance
        IWSLT09_CT.devset_dialog01_02\02\2nd-best ASR hypothesis for 1st utterance
        ...
        IWSLT09_CT.devset_dialog01_02\20\20th-best ASR hypothesis for 1st utterance
        IWSLT09_CT.devset_dialog01_04\01\best ASR hypothesis for 2nd utterance
        ...
    • reference translations
      • each line consists of three fields divided by the character '\'
      • sentence consisting of words divided by single spaces
      • format: <SENTENCE_ID>\<PARAPHRASE_ID>\<REFERENCE>
      • Field_1: sentence ID
      • Field_2: paraphrase ID
      • Field_3: reference translation
      • example:
      • IWSLT09_CT.devset_dialog01_02\01\1st reference translation for 1st input
        IWSLT09_CT.devset_dialog01_02\02\2nd reference translation for 1st input
        ...
        IWSLT09_CT.devset_dialog01_04\01\1st reference translation for 2nd input
        IWSLT09_CT.devset_dialog01_04\02\2nd reference translation for 2nd input
        ...
  • Chinese-English
    • IWSLT05 testset: 506 sentences, 16 reference translations (read speech)
    • IWSLT06 devset: 489 sentences, 16 reference translations (read speech, spontaneous speech)
    • IWSLT06 testset: 500 sentences, 16 reference translations (read speech, spontaneous speech)
    • IWSLT08 devset: 245 sentences, 7 reference translations (spontaneous speech)
    • IWSLT08 testset: 506 sentences, 7 reference translations (spontaneous speech)
    • IWSLT09 devset: 10 dialogs, 200 sentences, 4 reference translations (spontaneous speech)
  • English-Chinese
    • IWSLT05 testset: 506 sentences, 16 reference translations (read speech)
    • IWSLT08 devset: 245 sentences, 7 reference translations (spontaneous speech)
    • IWSLT08 testset: 506 sentences, 7 reference translations (spontaneous speech)
    • IWSLT09 devset: 10 dialogs, 210 sentences, 4 reference translations (spontaneous speech)
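The N-BEST files above use the second field as the hypothesis rank (up to 20 per utterance), so they can be read into best-first lists per sentence ID. A minimal sketch, with names of our own choosing:

```python
from collections import defaultdict

def read_nbest(lines, max_n=20):
    """Collect up to max_n ASR hypotheses per utterance from lines of
    the form <SENTENCE_ID>\\<NBEST_ID>\\<HYPOTHESIS>, returned
    best-first according to the N-best ID."""
    nbest = defaultdict(list)
    for line in lines:
        sid, rank, hyp = line.rstrip("\n").split("\\", 2)
        if int(rank) <= max_n:
            nbest[sid].append((int(rank), hyp))
    return {sid: [h for _, h in sorted(hyps)] for sid, hyps in nbest.items()}
```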

CHALLENGE Test Corpus:

  • Chinese-English
    • 27 dialogs, 405 sentences
    • coding: → see CHALLENGE Develop Corpus
    • TXT data format: → see CHALLENGE Develop Corpus
    • INFO data format: → see CHALLENGE Training Corpus
  • English-Chinese
    • 27 dialogs, 393 sentences
    • coding: → see CHALLENGE Develop Corpus
    • TXT data format: → see CHALLENGE Training Corpus
    • INFO data format: → see CHALLENGE Training Corpus


Translation Input Conditions

Spontaneous Speech

  • Challenge Task
    • Chinese-English
    • English-Chinese
→ ASR output (word lattice, N-best, 1-best) of ASR engines provided by IWSLT organizers

Correct Recognition Results

  • Challenge Task
    • Chinese-English
    • English-Chinese
  • BTEC Task
    • Arabic-English
    • Chinese-English
    • Turkish-English
→ text input

Evaluation

Subjective Evaluation:

  • Metrics:
    • ranking
      (= official evaluation metrics to order MT system scores)
      → all primary run submissions
    • fluency/adequacy
      → top-ranked primary run submission
    • dialog adequacy
      (= adequacy judgments in the context of the given dialog)
      → top-ranked primary run submission
  • Evaluators:
    • 3 graders per translation

Automatic Evaluation:

  • Metrics:
    • BLEU/NIST (NIST v13)
      → bug fixes to handle empty translations and the IWSLT supplied corpus can be found here.
    → up to 7 reference translations
    → all run submissions
  • Evaluation Specifications:
    • case+punc:
      • case sensitive
      • with punctuation marks tokenized
    • no_case+no_punc:
      • case insensitive (lower-case only)
      • no punctuation marks
  • Data Processing Prior to Evaluation:
    • English MT Output:
      • simple tokenization of punctuations (see 'tools/ppEnglish.case+punc.pl' script)
    • Chinese MT Output:
      • segmentation into characters (see 'tools/splitUTF8Characters' script)
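The two scoring conditions differ only in preprocessing. A rough sketch of both steps; the official 'tools/ppEnglish.case+punc.pl' and 'tools/splitUTF8Characters' scripts remain authoritative, and the ASCII-only punctuation set used below is an assumption:

```python
import re
import string

def normalize_no_case_no_punc(text):
    """Approximate the 'no_case+no_punc' condition: lower-case the
    text and strip punctuation marks (ASCII set only, an assumption)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def split_chinese_characters(text):
    """Approximate 'splitUTF8Characters': score Chinese MT output at
    the character level by spacing out consecutive characters."""
    return " ".join(ch for ch in text if not ch.isspace())
```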
  Posted by mpaul: 09:00

    Important Dates

    Evaluation Campaign

    Event Date
    Training Corpus Release June 19, 2009
    Test Corpus Release August 14, 2009
    Run Submission Due Aug 28, 2009
    Result Feedback to Participants September 11, 2009
    MT System Descriptions Due September 18, 2009
    Notification of Acceptance October 16, 2009
    Camera-ready Paper Due October 31, 2009
    Workshop December 1 - 2, 2009

    Technical Papers

    Event Date
    Paper Submission Due August 21, 2009
    Notification of Acceptance October 9, 2009
    Camera-ready Paper Due October 31, 2009
    Workshop December 1 - 2, 2009

    Posted by mpaul: 08:00

    Downloads


    IWSLT 2009 Corpus Release (for IWSLT 2009 participants only)

    User License Agreement DOC, PDF
    Download
      CHALLENGE Task   BTEC Task

    In order to get access to the corpus, please follow the procedure below. Access will be enabled AFTER we have received your original signed user license agreement.

    1. download the post-workshop user license agreement (click on the DOC/PDF link above), sign it, and send two copies to:

      Michael Paul
      National Institute of Information and Communications Technology
      Knowledge Creating Communication Research Center
      MASTAR Project
      Language Translation Group
      3-5 Hikaridai, "Keihanna Science City"
      Kyoto 619-0289, Japan

    2. download the corpus files using the ID and Password you obtained for the download of the training data files for IWSLT 2009.


    Corpus Data Files

    - train:
    • (BTEC) 20K sentence pairs of translation examples with case and punctuation information, segmented according to the utilized ASR engine
    • (CHALLENGE) in addition to BTEC@train, 10K sentence pairs of translation examples with dialog annotations
    - dev:
    • up to 6 evaluation data sets containing 500 source language sentences with multiple references and ASR output data files (= testsets of previous IWSLT evaluation campaigns)
    - test:
    • 500 source language sentences and ASR output data files (= input of the run submissions of this year's evaluation campaign)
    - tools:
    • preprocessing scripts (tokenization, NBEST extraction, etc.) used to prepare the data sets

    For data set details, click on the translation direction name tag.


    CHALLENGE
    Chinese-English
    train dev test tools
    TGZ TGZ TGZ TGZ
    English-Chinese
    train dev test tools
    TGZ TGZ TGZ TGZ

    BTEC
    Arabic-to-English
    train dev test tools
    TGZ TGZ TGZ TGZ
    Chinese-to-English
    train dev test tools
    TGZ TGZ TGZ TGZ
    Turkish-to-English
    train dev test tools
    TGZ TGZ TGZ TGZ



    Templates for LaTeX/MSWord

    - Gzipped TAR archive (all template files): latex_template_iwslt09.tgz

    - LaTeX style: iwslt09.sty
    - Example document: template.tex
    - Example document PS: template.ps
    - Example document PDF: template.pdf
    - Bibliography style: IEEEtran.bst
    - MS-Word template: template.doc

    Posted by mpaul: 06:00

    Submission

    Submissions of Technical Papers and MT System Descriptions must be made electronically in PDF format using the above links. The style files and templates are available at the download page. Authors are strongly encouraged to use the provided LaTeX style files or MS-Word equivalents. Submissions should follow the "Paper Submission Format Guidelines" listed below.

    Paper Submission Format Guidelines
    The format of each paper submission (evaluation campaign and technical paper) should agree with the "Camera-Ready Paper Format Guidelines" listed below.

    Camera-Ready Paper Format Guidelines
    • PDF file format
    • Maximum eight (8) pages (Standard A4 size: 210 mm by 297 mm preferred)
    • Single-spaced
    • Two (2) columns
    • Print in black ink on white paper, and check that the positioning (left and top margins) as well as other layout features are correct.
    • No smaller than nine (9) point type font throughout the paper, including figure captions.
    • To achieve the best viewing experience for the Proceedings, we strongly encourage you to use the Times-Roman font (the LaTeX style file as well as the Word template files use Times-Roman). This is needed in order to give the Proceedings a uniform look.
    • Do NOT include headers and footers. The page numbers and conference identification will be post-processed automatically at the time of printing the Proceedings.
    • The first page should have the paper title, author(s), and affiliation(s) centered on the page across both columns. The remainder of the text must be in the two-column format, staying within the indicated image area.
    • Follow the style of the sample paper that is included with regard to title, authors, affiliations, abstract, heading, and subheadings.
    Paper Title The paper title must be in boldface. All non-function words must be capitalized, and all other words in the title must be lower case. The paper title is centered across the top of the two columns on the first page, as indicated above.
    Authors' Name(s) The authors' name(s) and affiliation(s) appear centered below the paper title. If space permits, include a mailing address here. The templates indicate the area where the title and author information should go. These items need not be strictly confined to the number of lines indicated; papers with multiple authors and affiliations, for example, may require two or more lines for this information.
    Abstract Each paper must contain an abstract that appears at the beginning of the paper.
    Major Headings Major headings are in boldface, with the first word capitalized and the rest of the heading in lower case. Examples of the various levels of headings are included in the templates.
    Sub Headings Sub headings appear like major headings, except they start at the left margin in the column.
    Sub-Sub Headings Sub-sub headings appear like sub headings, except they are in italics and not bold face.
    References Number and list all references at the end of the paper. The references are numbered in order of appearance in the document. When referring to them in the text, type the corresponding reference number in square brackets as shown at the end of this sentence [1]. (This is done automatically when using the Latex template).
    Illustrations Illustrations must appear within the designated margins, and must be positioned within the paper margins. They may span the two columns. If possible, position illustrations at the top of columns, rather than in the middle or at the bottom. Caption and number every illustration. All half-tone or color illustrations must be clear when printed in black and white.


    Templates
    If your paper will be typeset using LaTeX, please download the template package here; it will generate the proper format. To extract the files under UNIX, run: $ tar -xzf latex_template_iwslt09.tgz
    Paper Status
    After submission, each paper will be given a unique Paper ID and a password. This will be shown on the confirmation page right after submission of the documents and a confirmation email including the Paper ID will be sent to the author of the paper as well. It will be possible to check and correct (if necessary) the submitted paper information (names, affiliations, etc.). Corrections/uploads can be made up to the respective submission deadline.

    Paper Acceptance/Rejection Information
    Each corresponding author will be notified by e-mail of acceptance/rejection. Reviewer feedback will also be available for each paper.

    Posted by mpaul: 05:00

    Run Submission Guidelines


    BTEC Translation Task (BTEC_AE, BTEC_CE, BTEC_TE)

    data format:

    • same format as the DEVELOP data sets.
    For details, refer to the respective README files:
    + IWSLT/2009/corpus/BTEC/Arabic-English/README.BTEC_AE.txt
    + IWSLT/2009/corpus/BTEC/Chinese-English/README.BTEC_CE.txt
    + IWSLT/2009/corpus/BTEC/Turkish-English/README.BTEC_TE.txt
    • input text is case-sensitive and contains punctuation marks
    • English MT output should:
      • be in the same format as the input file (<SentenceID>\01\MT_output_text)
      • be case-sensitive, with punctuation marks
      • contain the same number of lines (= sentences) as the input file
    Example:
         TEST_IWSLT09_001\01\This is the E translation of the 1st sentence.
         TEST_IWSLT09_002\01\This is the E translation of the 2nd sentence.
         TEST_IWSLT09_003\01\
         TEST_IWSLT09_004\01\The previous input (ID=003) could not be translated, thus the translation is empty!
         TEST_IWSLT09_005\01\...
         ...
         TEST_IWSLT09_469\01\This is the E translation of the last sentence.
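A submission file can be checked against the input file before mailing it: every input ID must appear, in order, with a (possibly empty) translation. A minimal validation sketch (the function name is our own, not an official tool):

```python
def validate_run(input_lines, output_lines):
    """Check that an MT output file has the same sentence IDs, in the
    same order, as the input file; empty translations are allowed."""
    if len(input_lines) != len(output_lines):
        raise ValueError("line counts differ: %d input vs %d output"
                         % (len(input_lines), len(output_lines)))
    for inp, out in zip(input_lines, output_lines):
        in_id = inp.split("\\", 2)[0]
        out_id = out.split("\\", 2)[0]
        if in_id != out_id:
            raise ValueError("ID mismatch: %s vs %s" % (in_id, out_id))
    return True
```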

    run submission format:
    • each participant has to translate and submit at least one translation of the given input files for each of the translation tasks they registered for.
    • multiple run submissions are allowed, but participants have to explicitly indicate one PRIMARY run that will be used for the human assessments. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) will be used for the subjective evaluation.
    • runs have to be submitted as a gzipped TAR archive (format see below) and sent as an email attachment to "Michael Paul" (michael.paul@nict.go.jp).
    TAR archive file structure:
    <UserID>/<TranslationTask>.<UserID>.primary.txt
            /<TranslationTask>.<UserID>.contrastive1.txt
            /<TranslationTask>.<UserID>.contrastive2.txt
            /...
        where: <UserID> = user ID of participant used to download data files
               <TranslationTask> = BTEC_AE | BTEC_CE | BTEC_TE

    Examples:
    nict/BTEC_AE.nict.primary.txt
       /BTEC_CE.nict.primary.txt
       /BTEC_CE.nict.contrastive1.txt
       /BTEC_CE.nict.contrastive2.txt
       /BTEC_CE.nict.contrastive3.txt
       /BTEC_TE.nict.primary.txt
       /BTEC_TE.nict.contrastive1.txt      
    • re-submitting your runs is allowed as long as the emails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only the runs of the most recent submission email will be used for the IWSLT 2009 evaluation; earlier emails will be ignored.

    CHALLENGE Translation Task (CT_CE, CT_EC)

    data format:

    • same format as the DEVELOP data sets.
    For details, refer to the respective README files:
    + IWSLT/2009/corpus/CHALLENGE/Chinese-English/
      README.CT_CE.txt
    + IWSLT/2009/corpus/CHALLENGE/English-Chinese/
      README.CT_EC.txt
    • the input data sets are created from the speech recognition results (ASR output) and therefore are CASE-INSENSITIVE and do NOT contain punctuation
    • the input data sets of the CHALLENGE tasks are separated according to the source language:
       + Chinese input data:
           IWSLT/2009/corpus/CHALLENGE/Chinese-English/test
       + English input data:
           IWSLT/2009/corpus/CHALLENGE/English-Chinese/test
    The dialog structure is reflected in the respective sentence ID.
    Example:
    (dialog structure)

      IWSLT09_CT.testset_dialog01_01\01\...1st English utterance...
      IWSLT09_CT.testset_dialog01_02\01\...1st Chinese utterance...
      IWSLT09_CT.testset_dialog01_03\01\...2nd English utterance...
      IWSLT09_CT.testset_dialog01_04\01\...2nd Chinese utterance...
      IWSLT09_CT.testset_dialog01_05\01\...3rd Chinese utterance...
      IWSLT09_CT.testset_dialog01_06\01\...3rd English utterance...
      ...
    (English input data to be translated into Chinese)
        + IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/TXT/
          IWSLT09_CT.testset.en.txt

          IWSLT09_CT.testset_dialog01_01\01\...1st English utterance...
          IWSLT09_CT.testset_dialog01_03\01\...2nd English utterance...
          IWSLT09_CT.testset_dialog01_06\01\...3rd English utterance...
          ...
    (Chinese input data to be translated into English)
        + IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/TXT/
          IWSLT09_CT.testset.zh.txt

          IWSLT09_CT.testset_dialog01_02\01\...1st Chinese utterance...
          IWSLT09_CT.testset_dialog01_04\01\...2nd Chinese utterance...
          IWSLT09_CT.testset_dialog01_05\01\...3rd Chinese utterance...
          ...
    • English MT output should:
         + be in the same format as the Chinese input file
           (<SentenceID>\01\MT_output_text)
         + be case-sensitive, with punctuation marks
         + contain the same number of lines (= sentences) as the Chinese input file
    Example:
        + nict/CT_CE.nict.primary.txt
          IWSLT09_CT.testset_dialog01_02\01\...E translation of 1st Chinese utterance...
          IWSLT09_CT.testset_dialog01_04\01\...E translation of 2nd Chinese utterance...
          IWSLT09_CT.testset_dialog01_05\01\...E translation of 3rd Chinese utterance...
          ...
    • Chinese MT output should:
         + be in the same format as the English input file (<SentenceID>\01\MT_output_text)
         + be case-sensitive, with punctuation marks
         + contain the same number of lines (= sentences) as the English input file
    Example:
        + nict/CT_EC.nict.primary.txt
          IWSLT09_CT.testset_dialog01_01\01\...C translation of 1st English utterance...
          IWSLT09_CT.testset_dialog01_03\01\...C translation of 2nd English utterance...
          IWSLT09_CT.testset_dialog01_06\01\...C translation of 3rd English utterance...
          ...

    run submission format:
    • each participant registered for the Challenge Task has to translate both translation directions (English-Chinese AND Chinese-English) and submit a total of 4 MT output files per run:
        + translations of 2 input data conditions (CRR, ASR) for Chinese-English AND
        + translations of 2 input data conditions (CRR, ASR) for English-Chinese.

    (1) the correct recognition result (CRR) data files, i.e., the human transcriptions of the Challenge Task data files that do not include recognition errors:
             CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/TXT/
                 IWSLT09_CT.testset.zh.txt
             EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/TXT/
                 IWSLT09_CT.testset.en.txt

    (2) the speech recognition output (ASR output, with recognition errors), whereby the participants are free to choose any of the following three ASR output data types as the input of their MT system:
            (a) word lattices:
                CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                    SLF/testset/*.zh.SLF
                EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                    SLF/testset/*.en.SLF

            (b) NBEST hypotheses:
                CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                    NBEST/IWSLT09.testset.zh.20BEST.txt
                    or
                    IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                    NBEST/testset/*.zh.20BEST.txt

                EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                    NBEST/IWSLT09.testset.en.20BEST.txt
                    or
                    IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                    NBEST/testset/*.en.20BEST.txt
    [NOTE] larger NBEST lists can be generated from the lattice data files using the following tools:
                  + IWSLT/2009/corpus/CHALLENGE/Chinese-English/tools/
                    extract_NBEST.zh.CT_CE.testset.sh
                  + IWSLT/2009/corpus/CHALLENGE/English-Chinese/tools/
                    extract_NBEST.en.CT_EC.testset.sh

            (c) 1BEST hypotheses:
                CE: IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                    1BEST/IWSLT09.testset.zh.1BEST.txt
                    or               
                    IWSLT/2009/corpus/CHALLENGE/Chinese-English/test/
                    1BEST/testset/*.zh.1BEST.txt
                EC: IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                    1BEST/IWSLT09.testset.en.1BEST.txt
                    or
                    IWSLT/2009/corpus/CHALLENGE/English-Chinese/test/
                    1BEST/testset/*.en.1BEST.txt
    [NOTE] submissions containing only the results for one translation direction will be excluded from the subjective evaluation for IWSLT 2009.
    • multiple run submissions are allowed, but participants have to explicitly indicate one PRIMARY run that will be used for the human assessments. All other run submissions are treated as CONTRASTIVE runs. If none of the runs is marked as PRIMARY, the latest submission (according to the file time-stamp) will be used for the subjective evaluation.
    • runs have to be submitted as a gzipped TAR archive (format see below) and sent as an email attachment to "Michael Paul" (michael.paul@nict.go.jp).
    TAR archive file structure:
    <UserID>/CT_CE.<UserID>.primary.CRR.txt
            /CT_CE.<UserID>.primary.ASR.<CONDITION>.txt
            /CT_EC.<UserID>.primary.CRR.txt
            /CT_EC.<UserID>.primary.ASR.<CONDITION>.txt
            /...
    where: <UserID> = user ID of participant used to download data files
          <CONDITION> = SLF | <NUM>
          <NUM> = number of recognition hypotheses used for translation, e.g.,
                    '1'  - 1-best recognition result
                    '20' - 20-best hypotheses list
    Examples:
        nict/CT_CE.nict.primary.CRR.txt
            /CT_CE.nict.primary.ASR.SLF.txt
            /CT_EC.nict.primary.CRR.txt
            /CT_EC.nict.primary.ASR.SLF.txt

            /CT_CE.nict.contrastive1.CRR.txt
            /CT_CE.nict.contrastive1.ASR.1.txt
            /CT_EC.nict.contrastive1.CRR.txt
            /CT_EC.nict.contrastive1.ASR.1.txt

            /CT_CE.nict.contrastive2.CRR.txt
            /CT_CE.nict.contrastive2.ASR.20.txt
            /CT_EC.nict.contrastive2.CRR.txt
            /CT_EC.nict.contrastive2.ASR.20.txt      
    • re-submitting your runs is allowed as long as the mails arrive BEFORE the submission deadline. If multiple TAR archives are submitted by the same participant, only the runs of the most recent submission mail will be used for the IWSLT 2009 evaluation; previous mails will be ignored.
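    As an illustration of the required archive layout, here is a minimal shell sketch that packages PRIMARY run files into a gzipped TAR archive. The user ID "acme" and the empty placeholder files are hypothetical; real submissions would contain your actual MT outputs under your own download user ID.

    ```shell
    # Sketch: build <UserID>.tgz with the directory layout described above.
    # "acme" is a stand-in for your actual <UserID>.
    set -e
    UserID="acme"
    mkdir -p "$UserID"
    # create placeholder run files (real files would hold your translations)
    for f in CT_CE.$UserID.primary.CRR.txt \
             CT_CE.$UserID.primary.ASR.SLF.txt \
             CT_EC.$UserID.primary.CRR.txt \
             CT_EC.$UserID.primary.ASR.SLF.txt; do
      touch "$UserID/$f"
    done
    # gzipped TAR archive rooted at the <UserID> directory
    tar czf "$UserID.tgz" "$UserID"
    # list the archive contents to verify the layout
    tar tzf "$UserID.tgz"
    ```

    Contrastive runs would be added the same way, with "contrastive1", "contrastive2", etc. in place of "primary".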

    Posted by mpaul : 04:45 | Trackback

    Automatic Evaluation Server

    We prepared an online evaluation server that allows you to conduct additional experiments to confirm the effectiveness of innovative methods and features within the IWSLT 2009 evaluation framework. You can submit translation hypothesis files for any of the IWSLT 2009 translation tasks. The hypothesis file format is the same as for the official run submissions.

    Before you can submit runs, you have to register a UserID/PassID. After login, click on "Make a new Submission", select the "Translation Direction" and "Training Data Condition" you used to generate the hypothesis file, upload the hypothesis file, specify a system ID and a short description that allows you to easily identify the run submission, and press "Calculate Scores".

    The server will sequentially calculate automatic scores for BLEU, NIST, WER, PER, TER, METEOR, F1, Precision, Recall, and GTM. Finally, the automatic scoring results will be sent to you via email. In addition, you can access the "Submission Log", which keeps track of all your run submissions. For details on a specific run, please click on the respective "Date". The scoring results of the "case+punc" evaluation specification (case-sensitive, with punctuation) are displayed in bold-face, and the scoring results of the "no_case+no_punc" evaluation specification (case-insensitive, without punctuation) are displayed in brackets.
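    To illustrate one of these metrics, below is a minimal, unofficial sketch of word error rate (WER): word-level edit distance between reference and hypothesis, normalized by reference length. The official server uses its own tokenization and scoring tools; this is only a conceptual example.

    ```python
    def wer(reference: str, hypothesis: str) -> float:
        """Word error rate: Levenshtein distance over words / reference length."""
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edit distance between ref[:i] and hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
                dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)

    # one deletion ("is") and one substitution ("the" -> "a") against 6 reference words
    print(wer("the hotel is near the station", "the hotel near a station"))  # -> 0.333...
    ```

    The other metrics (BLEU, NIST, METEOR, GTM, etc.) follow the same pattern of comparing hypothesis files against the reference translations, but with different matching criteria.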

    Registration


    Workshop

    The registration for the IWSLT 2009 workshop is now open. Please access the registration server and fill out the registration form.

    Registration Fees:

                Regular      Student      Deadline         Payment
    Early       JPY 20,000   JPY 15,000   Nov 20, 2009     online
    Late        JPY 25,000   JPY 20,000   Nov 30, 2009     at the door (registration desk)
    On-site     JPY 30,000   JPY 25,000   Dec 1 - 2, 2009  at the door (registration desk)

    The registration fees include: daily lunch, coffee breaks, a USB-stick version of the proceedings, participation in all sessions, and a banquet dinner on December 1. Please note that the registration fee is not refundable under any circumstances.

    Online payment is only possible until the extended Early Registration deadline. During the Late Registration period, you have to fill out the registration form online, but payment has to be made at the workshop registration desk (7F Miraikan).

    For payments at the registration desk, only cash can be accepted.

    If you need a visa for coming to Japan, please contact the IWSLT Secretariat (iwslt@the-convention.co.jp) as soon as possible, but not later than October 2nd, 2009.

    If you don't know whether you need a visa or not, please check here.



    Accommodations

    Hotel rates differ from day to day.
    Please check the links below or contact the hotel directly.


    ホテル グランパシフィック LE DAIBA
    (Grand Pacific Le Daiba)

    [map] [reservations]
    「〒135-8701 東京都港区台場2-6-1」
    2-6-1 Daiba, Minato-ku, Tokyo 135-8701, Japan
    (Tel) +81-3-5500-6711 (Fax) +81-3-5500-4507
    http://www.grandpacific.jp/eng (English)
    http://www.grandpacific.jp (Japanese)
    from 19,000 JPY (1 person, 1 night)
    (U07) Daiba station [map]
    ←1min,180JPY→ (U08) Fune-no-kagakukan station [venue]

    三井ガーデンホテル汐留イタリア街
    (Mitsui Garden Hotel Shiodome Italia-gai)

    [map] [reservations]
    「〒105-0021 東京都港区東新橋2-14-24」
    2-14-24, Higashi-shimbashi, Minato-ku, Tokyo 105-0021, Japan
    (Tel) +81-3-3431-1131 (Fax) +81-3-3431-2431
    http://www.gardenhotels.co.jp/eng/shiodome.html (English)
    http://www.gardenhotels.co.jp/shiodome/index.html (Japanese)
    from 10,300 JPY (1 person, 1 night)
    (U02) Shiodome station [map]
    ←15min,310JPY→ (U08) Fune-no-kagakukan station [venue]

    ホテルヴィラ フォンテーヌ汐留
    (Hotel Villa Fontaine Shiodome)

    [map] [reservations]
    「〒105-0021 東京都港区東新橋1-9-2」
    1-9-2 Higashi-shinbashi, Minato-ku, Tokyo 105-0021, Japan
    (Tel) +81-3-3569-2220 (Fax) +81-3-3569-2111
    http://www.hvf.jp/eng/shiodome.php (English)
    http://www.hvf.jp/shiodome (Japanese)
    http://www.hvf.jp/chi/shiodome.html (Chinese)
    from 10,000 JPY (1 person, 1 night)
    (U02) Shiodome station [map]
    ←15min,310JPY→ (U08) Fune-no-kagakukan station [venue]

    ホテル日航東京
    (Hotel Nikko Tokyo)

    [map] [reservations]
    「〒135-8625 東京都港区台場1丁目9番1号」
    1-9-1 Daiba, Minato-ku, Tokyo 135-8625, Japan
    (Tel) +81-3-5500-5500 (Fax) +81-3-5500-2525
    http://www.hnt.co.jp/en/index.html (English)
    http://www.hnt.co.jp/ (Japanese)
    http://www.jalhotels.com/cn/domestic/kanto/index.html#tokyo (Chinese)
    from 9,500 JPY (1 person, 1 night)
    (U07) Daiba station [map]
    ←1min,180JPY→ (U08) Fune-no-kagakukan station [venue]

    ホテルトラスティ東京ベイサイド
    (Hotel Trusty Tokyo Bayside)

    [map] [reservations]
    「〒135-0063 東京都江東区有明3-1-5」
    3-1-5 Ariake, Koto-ku, Tokyo 135-0063, Japan
    (Tel) +81-3-6700-0001 (Fax) +81-3-6700-0007
    http://www.trusty.jp/tokyobayside/pdf/tokyobayside_e.pdf (English)
    http://www.trusty.jp/tokyobayside (Japanese)
    from 6,700 JPY (1 person, 1 night)
    (U11) Kokusai-tenjijou-seimon station [map]
    ←6min,240JPY→ (U08) Fune-no-kagakukan station [venue]

    ホテルサンルート有明
    (Hotel Sunroute Ariake)

    [map] [reservations]
    「〒135-0063 東京都江東区有明3-1-20」
    3-1-20 Ariake, Koto-ku, Tokyo 135-0063, Japan
    (Tel) +81-3-5530-3610
    http://www.sunroutehotel.jp/hari-eng/index.asp (English)
    http://www.sunroutehotel.jp/ariake/ (Japanese)
    http://www.sunroutehotel.jp/hari-chi/index.asp (Chinese)
    from 6,500 JPY (1 person, 1 night)
    (U11) Kokusai-tenjijou-seimon station [map]
    ←6min,240JPY→ (U08) Fune-no-kagakukan station [venue]

    東京ベイ有明ワシントンホテル
    (Tokyo Bay Ariake Washington Hotel)

    [map] [reservations]
    「〒135-0063 東京都江東区有明3-1-28」
    3-1-28 Ariake, Koto-ku, Tokyo 135-0063, Japan
    (Tel) +81-3-5564-0111 (Fax) +81-3-5564-0525
    http://www.wh-rsv.com/english/tokyo_bay_ariake (English)
    http://www.wh-rsv.com/wh/hotels/ariake/index.html (Japanese)
    http://www.wh-rsv.com/chinese/tokyo_bay_ariake (Chinese)
    from 6,000 JPY (1 person, 1 night)
    (U12) Ariake station [map]
    ←7min,240JPY→ (U08) Fune-no-kagakukan station [venue]


    Program


    December 1, 2009

    09:00 09:30 workshop registration

    Workshop Opening
    09:30 09:40 Welcome Remarks
    Satoshi NAKAMURA (NICT, Japan)
    Evaluation Campaign: "Overview Talk"
    09:40 10:10 Overview of the IWSLT 2009 Evaluation Campaign
    Michael PAUL (NICT, Japan)
    coffee break
    Evaluation Campaign: "Challenge Task"
    10:30 11:00 Two methods for stabilizing MERT: NICT at IWSLT 2009
    Masao UTIYAMA, Hirofumi YAMAMOTO, Eiichiro SUMITA (NICT, Japan)
    11:00 11:30 Low-Resource Machine Translation Using MaTrEx: The DCU Machine Translation System for IWSLT 2009
    Yanjun MA, Tsuyoshi OKITA, Özlem ÇETINOGLU, Jinhua DU, Andy WAY (Dublin City University, Ireland)
    11:30 12:00 The CASIA Statistical Machine Translation System for IWSLT 2009
    Maoxi LI, Jiajun ZHANG, Yu ZHOU, Chengqing ZONG (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; China)
    lunch break
    Invited Talk
    13:00 14:00 Human Translation and Machine Translation
    Philipp KOEHN (University of Edinburgh, UK)
    Technical Paper: "Oral I"
    14:00 14:30 Morphological Pre-Processing for Turkish to English Statistical Machine Translation
    Arianna BISAZZA, Marcello FEDERICO (FBK-irst, Italy)
    14:30 15:00 Enriching SCFG Rules Directly From Efficient Bilingual Chart Parsing
    Martin CMEJREK, Bowen ZHOU, Bing XIANG (IBM, USA)
    15:00 15:30 A Unified Framework for Phrase-Based, Hierarchical, and Syntax-Based Statistical Machine Translation
    Hieu HOANG, Philipp KOEHN, Adam LOPEZ (Univ. Edinburgh, UK)
    coffee break
    Evaluation Campaign: "Poster I"
    15:50 16:50 The TÜBITAK-UEKAE Statistical Machine Translation System for IWSLT 2009
    Coskun MERMER, Hamza KAYA, Mehmet Ugur DOGAN (TÜBITAK-UEKAE, Turkey)
    15:50 16:50 The UOT System: Improve String-to-Tree Translation Using Head-Driven Phrasal Structure Grammar and Predicate-Argument Structures
    Xianchao WU, Takuya MATSUZAKI, Naoaki OKAZAKI, Yusuke MIYAO, Jun'ichi TSUJII (University of Tokyo, Japan)
    15:50 16:50 The GREYC Translation Memory for the IWSLT 2009 Evaluation Campaign: one step beyond translation memory
    Yves LEPAGE, Adrien LARDILLEUX, Julien GOSME (University of Caen, France)
    15:50 16:50 The ICT Statistical Machine Translation Systems for the IWSLT 2009
    Haitao MI, Yang LIU, Tian XIA, Xinyan XIAO, Yang FENG, Jun XIE, Hao XIONG, Zhaopeng TU, Daqi ZHENG, Yajuan LU, Qun LIU (Institute of Computing Technology, Chinese Academy of Sciences; China)
    15:50 16:50 The University of Washington Machine Translation System for IWSLT 2009
    Mei YANG, Amittai AXELROD, Kevin DUH, Katrin KIRCHHOFF (University of Washington, USA)
    15:50 16:50 Statistical Machine Translation adding Pattern-based Machine Translation in Chinese-English Translation
    Jin'ichi MURAKAMI, Masato TOKUHISA, Satoru IKEHARA (Tottori University, Japan)
    Demo Session
    16:50 17:20 Network-based Speech-to-Speech Translation
    Chiori HORI, Sakriani SAKTI, Michael PAUL, Satoshi NAKAMURA (NICT, Japan)
    Banquet
    18:00 20:00 Restaurant "LA TERRE" (Miraikan, 7F)

    December 2, 2009

    09:00 09:30 workshop registration

    Invited Talk
    09:30 10:30 Two-way Speech-to-Speech Translation for Communicating Across Language Barriers
    Premkumar NATARAJAN (BBN Technologies, USA)
    coffee break
    Technical Paper: "Oral II"
    10:50 11:20 Structural Support Vector Machines for Log-Linear Approach in Statistical Machine Translation
    Katsuhiko HAYASHI (Doshisha University, Japan); Taro WATANABE, Hajime TSUKADA and Hideki ISOZAKI (NTT, Japan)
    11:20 11:50 Online Language Model Adaptation for Spoken Dialog Translation
    Germán SANCHIS-TRILLES (Universitat Politècnica de València, Spain); Mauro CETTOLO, Nicola BERTOLDI, Marcello FEDERICO (FBK-irst, Italy)
    lunch break
    Invited Talk
    13:00 14:00 Monolingual Knowledge Acquisition and a Multilingual Information Environment
    Kentaro TORISAWA (NICT, Japan)
    Evaluation Campaign: "Poster II"
    14:00 15:00 AppTek Turkish-English Machine Translation System Description for IWSLT 2009
    Selçuk KÖPRÜ (AppTek Inc., Turkey)
    14:00 15:00 LIG approach for IWSLT09 : Using Multiple Morphological Segmenters for Spoken Language Translation of Arabic
    Fethi BOUGARES, Laurent BESACIER, Hervé BLANCHON (LIG, France)
    14:00 15:00 Barcelona Media SMT system description for the IWSLT 2009: introducing source context information
    Marta R. COSTA-JUSSA, Rafael E. BANCHS (Barcelona Media, Spain)
    14:00 15:00 FBK @ IWSLT-2009
    Nicola BERTOLDI, Arianna BISAZZA, Mauro CETTOLO, Marcello FEDERICO (FBK-irst, Italy); Germán SANCHIS-TRILLES (Universitat Politècnica de València, Spain)
    14:00 15:00 LIUM's Statistical Machine Translation Systems for IWSLT 2009
    Holger SCHWENK, Loïc BARRAULT, Yannick ESTÈVE, Patrik LAMBERT (University of Le Mans, France)
    14:00 15:00 I²R's Machine Translation System for IWSLT 2009
    Xiangyu DUAN, Deyi XIONG, Hui ZHANG, Min ZHANG, Haizhou LI (Institute for Infocomm Research, Singapore)
    coffee break
    Evaluation Campaign: "BTEC Task"
    15:20 15:50 The NUS Statistical Machine Translation System for IWSLT 2009
    Preslav NAKOV, Chang LIU, Wei LU, Hwee Tou NG (National University of Singapore, Singapore)
    15:50 16:20 The UPV Translation System for IWSLT 2009
    Guillem GASCÓ, Joan Andreu SÁNCHEZ (Universitat Politècnica de València, Spain)
    16:20 16:50 The MIT-LL/AFRL System for IWSLT 2009
    Wade SHEN, Brian DELANEY, Arya Ryan AMINZADEH (MIT Lincoln Laboratory, USA); Timothy ANDERSON, Raymond SLYH (Air Force Research Laboratory, USA)
    Workshop Closing
    16:50 17:00 Closing Remarks
    Marcello FEDERICO (FBK-irst, Italy)


    Keynote Speeches


    Keynote Speech 1

    Human Translation and Machine Translation
    Philipp KOEHN (University of Edinburgh, UK)
    While most recent machine translation work has focused on the gisting application (i.e., translating web pages), another important application is to aid human translators. To build better computer-aided translation tools, we first need to understand how human translators work. We discuss how human translators work and what tools they typically use. We also built a novel tool that offers post-editing, interactive sentence completion, and display of translation options (online at www.caitra.org). We collected timing logs of interactions with the tool, which allow a detailed analysis of translator behavior.

    Keynote Speech 2

    Two-way Speech-to-Speech Translation for Communicating Across Language Barriers
    Premkumar NATARAJAN (BBN Technologies, USA)
    Two-way speech-to-speech (S2S) translation is a spoken language application that integrates multiple technologies including speech recognition, machine translation, text-to-speech synthesis, and dialog management. In recent years, research into S2S systems has resulted in several modeling techniques for improving coverage on broad domains and rapid configuration for new language pairs or domains. This talk will highlight recent advances in the S2S area, ranging from improvements in component technologies to improvements in the end-to-end system for mobile use. I will also present metrics for evaluating S2S technology, a methodology for determining the impact of different causes of errors, and future directions for research and development.

    Keynote Speech 3

    Monolingual Knowledge Acquisition and a Multilingual Information Environment
    Kentaro TORISAWA (NICT, Japan)
    Large-scale knowledge acquisition from the Web has been a popular research topic in the last five years. This talk gives an overview of our current project, which aims at the acquisition of a large-scale semantic network from the Web, and explores its possible interaction with machine translation research. In particular, I would like to focus on two topics: multilingual corpora as a source of knowledge, and the applications of machine translation enabled by our technology. I will discuss a framework of bilingual co-training that gives a marked improvement in the accuracy of the acquired knowledge by using two corpora written in two different languages. I will also show that our technology can enable new types of machine translation tasks in Web applications.


    Proceedings

    - Author Index -

    Evaluation Campaign
    pp.1-18 paper slides bib Overview of the IWSLT 2009 Evaluation Campaign
    Michael PAUL
    pp.19-23 paper (not yet) bib apptek
    AppTek Turkish-English Machine Translation System Description for IWSLT 2009
    Selçuk KÖPRÜ
    pp.24-28 paper poster bib bmrc
    Barcelona Media SMT system description for the IWSLT 2009: introducing source context information
    Marta R. COSTA-JUSSA, Rafael E. BANCHS
    pp.29-36 paper slides bib dcu
    Low-Resource Machine Translation Using MaTrEx: The DCU Machine Translation System for IWSLT 2009
    Yanjun MA, Tsuyoshi OKITA, Özlem ÇETINOGLU, Jinhua DU, Andy WAY
    pp.37-44 paper poster bib fbk
    FBK @ IWSLT-2009
    Nicola BERTOLDI, Arianna BISAZZA, Mauro CETTOLO, Marcello FEDERICO (FBK-irst, Italy); Germán SANCHIS-TRILLES (Universitat Politècnica de València, Spain)
    pp.45-49 paper poster bib greyc
    The GREYC Translation Memory for the IWSLT 2009 Evaluation Campaign: one step beyond translation memory
    Yves LEPAGE, Adrien LARDILLEUX, Julien GOSME
    pp.50-54 paper poster bib i2r
    I²R's Machine Translation System for IWSLT 2009
    Xiangyu DUAN, Deyi XIONG, Hui ZHANG, Min ZHANG, Haizhou LI
    pp.55-59 paper poster bib ict
    The ICT Statistical Machine Translation Systems for the IWSLT 2009
    Haitao MI, Yang LIU, Tian XIA, Xinyan XIAO, Yang FENG, Jun XIE, Hao XIONG, Zhaopeng TU, Daqi ZHENG, Yajuan LU, Qun LIU
    pp.60-64 paper poster bib lig
    LIG approach for IWSLT09 : Using Multiple Morphological Segmenters for Spoken Language Translation of Arabic
    Fethi BOUGARES, Laurent BESACIER, Hervé BLANCHON (LIG, France)
    pp.65-70 paper poster bib lium
    LIUM's Statistical Machine Translation Systems for IWSLT 2009
    Holger SCHWENK, Loïc BARRAULT, Yannick ESTÈVE, Patrik LAMBERT
    pp.71-78 paper slides bib mit
    The MIT-LL/AFRL IWSLT-2009 System
    Wade SHEN, Brian DELANEY, Arya Ryan AMINZADEH (MIT Lincoln Laboratory, USA); Timothy ANDERSON, Raymond SLYH (Air Force Research Laboratory)
    pp.79-82 paper slides bib nict
    Two methods for stabilizing MERT: NICT at IWSLT 2009
    Masao UTIYAMA, Hirofumi YAMAMOTO, Eiichiro SUMITA
    pp.83-90 paper slides bib nlpr
    The CASIA Statistical Machine Translation System for IWSLT 2009
    Maoxi LI, Jiajun ZHANG, Yu ZHOU, Chengqing ZONG
    pp.91-98 paper slides bib nus
    The NUS Statistical Machine Translation System for IWSLT 2009
    Preslav NAKOV, Chang LIU, Wei LU, Hwee Tou NG
    pp.99-106 paper poster bib tokyo
    The UOT System: Improve String-to-Tree Translation Using Head-Driven Phrasal Structure Grammar and Predicate-Argument Structures
    Xianchao WU, Takuya MATSUZAKI, Naoaki OKAZAKI, Yusuke MIYAO, Jun'ichi TSUJII
    pp.107-112 paper poster bib tottori
    Statistical Machine Translation adding Pattern-based Machine Translation in Chinese-English Translation
    Jin'ichi MURAKAMI, Masato TOKUHISA, Satoru IKEHARA
    pp.113-117 paper poster bib tubitak
    The TÜBITAK-UEKAE Statistical Machine Translation System for IWSLT 2009
    Coskun MERMER, Hamza KAYA, Mehmet Ugur DOGAN
    pp.118-123 paper slides bib upv
    UPV Translation System for IWSLT 2009
    Guillem GASCÓ, Joan Andreu SÁNCHEZ
    pp.124-128 paper poster bib uw
    The University of Washington Machine Translation System for IWSLT 2009
    Mei YANG, Amittai AXELROD, Kevin DUH, Katrin KIRCHHOFF

    Technical Paper
    pp.129-135 paper slides bib Morphological Pre-Processing for Turkish to English Statistical Machine Translation
    Arianna BISAZZA, Marcello FEDERICO
    pp.136-143 paper slides bib Enriching SCFG Rules Directly From Efficient Bilingual Chart Parsing
    Martin CMEJREK, Bowen ZHOU, Bing XIANG
    pp.144-151 paper slides bib Structural Support Vector Machines for Log-Linear Approach in Statistical Machine Translation
    Katsuhiko HAYASHI, Taro WATANABE, Hajime TSUKADA, Hideki ISOZAKI
    pp.152-159 paper slides bib A Unified Framework for Phrase-Based, Hierarchical, and Syntax-Based Statistical Machine Translation
    Hieu HOANG, Philipp KOEHN, Adam LOPEZ
    pp.160-167 paper slides bib Online Language Model Adaptation for Spoken Dialog Translation
    Germán SANCHIS-TRILLES, Mauro CETTOLO, Nicola BERTOLDI, Marcello FEDERICO

    Demo
    pp.168-168 paper slides bib Network-based Speech-to-Speech Translation
    Chiori HORI, Sakriani SAKTI, Michael PAUL, Noriyuki KIMURA, Yutaka ASHIKARI, Ryosuke ISOTANI, Eiichiro SUMITA, Satoshi NAKAMURA (NICT, Japan)

    Keynote Speech
          -       abstract slides - Human Translation and Machine Translation
    Philipp KOEHN (University of Edinburgh, UK)
          -       abstract (not yet) - Two-way Speech-to-Speech Translation for Communicating Across Language Barriers
    Premkumar NATARAJAN (BBN Technologies, USA)
          -       abstract slides - Monolingual Knowledge Acquisition and a Multilingual Information Environment
    Kentaro TORISAWA (NICT, Japan)


    Author Index

    A-B-C-D-E-F-G-H-I-K-L-M-N-O-P-S-T-U-W-X-Y-Z

      A  
      AMINZADEH, Arya Ryan  71
      ANDERSON, Timothy   71
      ASHIKARI, Yutaka   168
      AXELROD, Amittai   124
      B  
      BANCHS, Rafael E.   24
      BARRAULT, Loïc   65
      BERTOLDI, Nicola   37, 160
      BESACIER, Laurent   60
      BISAZZA, Arianna   37, 129
      BLANCHON, Hervé   60
      BOUGARES, Fethi   60
      C  
      ÇETINOGLU, Özlem   29
      CETTOLO, Mauro   37, 160
      CMEJREK, Martin   136
      COSTA-JUSSÀ, Marta R.   24
      D  
      DELANEY, Brian   71
      DOGAN, Mehmet Ugur   113
      DU, Jinhua   29
      DUAN, Xiangyu   50
      DUH, Kevin   124
      E  
      ESTÈVE, Yannick   65
      F  
      FEDERICO, Marcello   37, 129, 160
      FENG, Yang   55
      G  
      GASCÓ, Guillem   118
      GOSME, Julien   45
      H  
      HAYASHI, Katsuhiko   144
      HOANG, Hieu   152
      HORI, Chiori   168
      I  
      IKEHARA, Satoru   107
      ISOTANI, Ryosuke   168
      ISOZAKI, Hideki   144
      K  
      KAYA, Hamza   113
      KIMURA, Noriyuki   168
      KIRCHHOFF, Katrin   124
      KOEHN, Philipp   152
      KÖPRÜ, Selçuk   19
      L  
      LAMBERT, Patrik   65
      LARDILLEUX, Adrien   45
      LEPAGE, Yves   45
      LI, Haizhou   50
      LI, Maoxi   83
      LIU, Chang   91
      LIU, Qun   55
      LIU, Yang   55
      LOPEZ, Adam   152
      LU, Wei   91
      LU, Yajuan   55
      M  
      MA, Yanjun   29
      MATSUZAKI, Takuya   99
      MERMER, Coskun   113
      MI, Haitao   55
      MIYAO, Yusuke   99
      MURAKAMI, Jin'ichi   107
      N  
      NAKAMURA, Satoshi   168
      NAKOV, Preslav   91
      NG, Hwee Tou   91
      O  
      OKAZAKI, Naoaki   99
      OKITA, Tsuyoshi   29
      P  
      PAUL, Michael   1, 168
      S  
      SAKTI, Sakriani   168
      SÁNCHEZ, Joan Andreu   118
      SANCHIS-TRILLES, Germán   37, 160
      SCHWENK, Holger   65
      SHEN, Wade   71
      SLYH, Raymond   71
      SUMITA, Eiichiro   79, 168
      T  
      TOKUHISA, Masato   107
      TSUJII, Jun'ichi   99
      TSUKADA, Hajime   144
      TU, Zhaopeng   55
      U  
      UTIYAMA, Masao   79
      W  
      WATANABE, Taro   144
      WAY, Andy   29
      WU, Xianchao   99
      X  
      XIA, Tian   55
      XIANG, Bing   136
      XIAO, Xinyan   55
      XIE, Jun   55
      XIONG, Deyi   50
      XIONG, Hao   55
      Y  
      YAMAMOTO, Hirofumi   79
      YANG, Mei   124
      Z  
      ZHANG, Hui   50
      ZHANG, Jiajun   83
      ZHANG, Min   50
      ZHENG, Daqi   55
      ZHOU, Bowen   136
      ZHOU, Yu   83
      ZONG, Chengqing   83


    Bibliography

    @inproceedings{iwslt09:EC:overview,
    author= {Michael Paul},
    title= {{Overview of the IWSLT 2009 Evaluation Campaign}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {1-18},
    }
     
    @inproceedings{iwslt09:EC:apptek,
    author= {Sel\c{c}uk K\"{o}pr\"{u}},
    title= {{AppTek Turkish-English Machine Translation System Description for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {19-23},
    }
     
    @inproceedings{iwslt09:EC:bmrc,
    author= {Marta R. Costa-Juss\`{a} and Rafael E. Banchs},
    title= {{Barcelona Media SMT system description for the IWSLT 2009: introducing source context information}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {24-28},
    }
     
    @inproceedings{iwslt09:EC:dcu,
    author= {Yanjun Ma and Tsuyoshi Okita and \"{O}zlem \c{C}etino\u{g}lu and Jinhua Du and Andy Way},
    title= {{Low-Resource Machine Translation Using MaTrEx: The DCU Machine Translation System for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {29-36},
    }
     
    @inproceedings{iwslt09:EC:fbk,
    author= {Nicola Bertoldi and Arianna Bisazza and Mauro Cettolo and Germ\'{a}n Sanchis-Trilles and Marcello Federico},
    title= {{FBK @ IWSLT-2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {37-44},
    }
     
    @inproceedings{iwslt09:EC:greyc,
    author= {Yves Lepage and Adrien Lardilleux and Julien Gosme},
    title= {{The GREYC Translation Memory for the IWSLT 2009 Evaluation Campaign: one step beyond translation memory}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {45-49},
    }
     
    @inproceedings{iwslt09:EC:i2r,
    author= {Xiangyu Duan and Deyi Xiong and Hui Zhang and Min Zhang and Haizhou Li},
    title= {{I${}^{2}$R's Machine Translation System for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {50-54},
    }
     
    @inproceedings{iwslt09:EC:ict,
    author= {Haitao Mi and Yang Liu and Tian Xia and Xinyan Xiao and Yang Feng and Jun Xie and Hao Xiong and Zhaopeng Tu and Daqi Zheng and Yajuan Lu and Qun Liu},
    title= {{The ICT Statistical Machine Translation Systems for the IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {55-59},
    }
     
    @inproceedings{iwslt09:EC:lig,
    author= {Fethi Bougares and Laurent Besacier and Herv\'{e} Blanchon},
    title= {{LIG approach for IWSLT09 : Using Multiple Morphological Segmenters for Spoken Language Translation of Arabic}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {60-64},
    }
     
    @inproceedings{iwslt09:EC:lium,
    author= {Holger Schwenk and Lo\"{i}c Barrault and Yannick Est\`{e}ve and Patrik Lambert},
    title= {{LIUM's Statistical Machine Translation Systems for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {65-70},
    }
     
    @inproceedings{iwslt09:EC:mit,
    author= {Wade Shen and Brian Delaney and Arya Ryan Aminzadeh and Timothy Anderson and Raymond Slyh},
    title= {{The MIT-LL/AFRL IWSLT-2009 System}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {71-78},
    }
     
    @inproceedings{iwslt09:EC:nict,
    author= {Masao Utiyama and Hirofumi Yamamoto and Eiichiro Sumita},
    title= {{Two methods for stabilizing MERT: NICT at IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {79-82},
    }
     
    @inproceedings{iwslt09:EC:nlpr,
    author= {Maoxi Li and Jiajun Zhang and Yu Zhou and Chengqing Zong},
    title= {{The CASIA Statistical Machine Translation System for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {83-90},
    }
     
    @inproceedings{iwslt09:EC:nus,
    author= {Preslav Nakov and Chang Liu and Wei Lu and Hwee Tou Ng},
    title= {{The NUS Statistical Machine Translation System for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {91-98},
    }
     
    @inproceedings{iwslt09:EC:tokyo,
    author= {Xianchao Wu and Takuya Matsuzaki and Naoaki Okazaki and Yusuke Miyao and Jun'ichi Tsujii},
    title= {{The UOT System: Improve String-to-Tree Translation Using Head-Driven Phrasal Structure Grammar and Predicate-Argument Structures}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {99-106},
    }
     
    @inproceedings{iwslt09:EC:tottori,
    author= {Jin'ichi Murakami and Masato Tokuhisa and Satoru Ikehara},
    title= {{Statistical Machine Translation adding Pattern-based Machine translation in Chinese-English Translation}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {107-112},
    }
     
    @inproceedings{iwslt09:EC:tubitak,
    author= {{Co\c{s}kun} Mermer and Hamza Kaya and Mehmet U\v{g}ur Do\v{g}an},
    title= {{The T\"{U}BITAK-UEKAE Statistical Machine Translation System for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {113-117},
    }
     
    @inproceedings{iwslt09:EC:upv,
    author= {Guillem Gasc\'{o} and Joan Andreu S\'{a}nchez},
    title= {{UPV Translation System for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {118-123},
    }
     
    @inproceedings{iwslt09:EC:uw,
    author= {Mei Yang and Amittai Axelrod and Kevin Duh and Katrin Kirchhoff},
    title= {{The University of Washington Machine Translation System for IWSLT 2009}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {124-128},
    }
     
    @inproceedings{iwslt09:TP:bisazza,
    author= {Arianna Bisazza and Marcello Federico},
    title= {{Morphological Pre-Processing for Turkish to English Statistical Machine Translation}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {129-135},
    }
     
    @inproceedings{iwslt09:TP:cmejrek,
    author= {Martin Cmejrek and Bowen Zhou and Bing Xiang},
    title= {{Enriching SCFG Rules Directly From Efficient Bilingual Chart Parsing}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {136-143},
    }
     
    @inproceedings{iwslt09:TP:hayashi,
    author= {Katsuhiko Hayashi and Taro Watanabe and Hajime Tsukada and Hideki Isozaki},
    title= {{Structural Support Vector Machines for Log-Linear Approach in Statistical Machine Translation}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {144-151},
    }
     
    @inproceedings{iwslt09:TP:hoang,
    author= {Hieu Hoang and Philipp Koehn and Adam Lopez},
    title= {{A Unified Framework for Phrase-Based, Hierarchical, and Syntax-Based Statistical Machine Translation}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {152-159},
    }
     
    @inproceedings{iwslt09:TP:sanchis,
    author= {Germ\'{a}n Sanchis-Trilles and Mauro Cettolo and Nicola Bertoldi and Marcello Federico},
    title= {{Online Language Model Adaptation for Spoken Dialog Translation}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {160-167},
    }
     
    @inproceedings{iwslt09:DEMO:nict,
    author= {Chiori Hori and Sakriani Sakti and Michael Paul and Noriyuki Kimura and Yutaka Ashikari and Ryosuke Isotani and Eiichiro Sumita and Satoshi Nakamura},
    title= {{Network-based Speech-to-Speech Translation}},
    year= {2009},
    booktitle= {Proc. of the International Workshop on Spoken Language Translation},
    address= {Tokyo, Japan},
    pages= {168},
    }
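For reference, the proceedings entries above can be cited directly from LaTeX via BibTeX; a minimal sketch, assuming the entries are saved to a hypothetical file `iwslt09.bib`:

```latex
\documentclass{article}
\begin{document}
% Cite an IWSLT 2009 proceedings entry by its key
As shown by Bisazza and Federico~\cite{iwslt09:TP:bisazza},
morphological pre-processing can help Turkish-to-English SMT.

\bibliographystyle{plain}
\bibliography{iwslt09} % hypothetical .bib file holding the entries above
\end{document}
```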

    Posted by mpaul : 04:08 | Trackback

    Venue

    National Museum of Emerging Science and Innovation

    2-41, Aomi, Koto-ku, Tokyo, Japan
    (Tel) +81-3-3570-9151 (Fax) +81-3-3570-9150
    http://www.miraikan.jst.go.jp/en

    Entrance: 1F (floor map)
    Workshop: 7F (floor map)

    Access
    Tourism Info


    Gallery

    IWSLT 2009
    December 1-2, 2009
    National Museum of Emerging Science and Innovation
    Tokyo, Japan


    December 1: camera 1, camera 2
    December 2: camera 1, camera 2


    Organizers

    Organizers
    • Alex Waibel (CMU, USA / UKA, Germany)
    • Marcello Federico (FBK, Italy)
    • Satoshi Nakamura (NICT, Japan)

    Chairs

    • Eiichiro Sumita (NICT, Japan; Workshop)
    • Michael Paul (NICT, Japan; Evaluation Campaign)
    • Marcello Federico (FBK, Italy; Technical Paper)

    Program Committee

    • Laurent Besacier (LIG, France)
    • Francisco Casacuberta (ITI-UPV, Spain)
    • Boxing Chen (NRC, Canada)
    • Philipp Koehn (Univ. Edinburgh, UK)
    • Philippe Langlais (Univ. Montreal, Canada)
    • Geunbae Lee (Postech, Korea)
    • Yves Lepage (GREYC, France)
    • Haizhou Li (I2R, Singapore)
    • Qun Liu (ICT, China)
    • José B. Mariño (TALP-UPC, Spain)
    • Coskun Mermer (TUBITAK, Turkey)
    • Christof Monz (QMUL, UK)
    • Hermann Ney (RWTH, Germany)
    • Holger Schwenk (LIUM, France)
    • Wade Shen (MIT-LL, USA)
    • Hajime Tsukada (NTT, Japan)
    • Haifeng Wang (TOSHIBA, China)
    • Andy Way (DCU, Ireland)
    • Chengqing Zong (CASIA, China)

    Local Arrangements

    • Mari Oku (NICT, Japan)

    Supporting Organizations

    National Institute of Information and Communications Technology
    The Scientific and Technological Research Council of Turkey (TUBITAK), National Research Institute of Electronics and Cryptology (UEKAE)


    Contact

    WORKSHOP ORGANIZATION
    Eiichiro Sumita
    (reverse) email: jp *dot* co *dot* nict *at* sumita *dot* eiichiro

    EVALUATION CAMPAIGN
    Michael Paul
    (reverse) email: jp *dot* go *dot* nict *at* paul *dot* michael

    TECHNICAL PAPER
    Marcello Federico
    (reverse) email: eu *dot* fbk *at* federico

    LOCAL ARRANGEMENT
    Mari Oku
    (reverse) email: jp *dot* go *dot* nict *dot* khn *at* iwsltlocal09


    National Institute of Information and Communications Technology (NICT)
    Knowledge Creating Communication Research Center
    MASTAR Project

    2-2-2 Hikaridai, Keihanna Science City, Kyoto 619-0288, Japan
    TEL: +81-774-95-1301
    FAX: +81-774-95-1308


    References

    Events Co-located with IWSLT 2009

    IWSLT Evaluation Campaigns

