
Machine translation 'translation' mode giving "</S> </S> </S> </S> </S> </S> . </S> . . </S> </S>" #110

Open
HuidaQ opened this issue Oct 6, 2017 · 2 comments


HuidaQ commented Oct 6, 2017

I'm trying to train an NMT model on the commoncrawl data (from http://www.statmt.org/wmt15/translation-task.html). Training seems to be going fine; here is a partial paste of the log:

Training status:
         batch_interrupt_received: False
         epoch_interrupt_received: False
         epoch_started: True
         epochs_done: 0
         iterations_done: 558
         received_first_batch: True
         resumed_from: None
         training_started: True
Log records from the iteration 558:
         decoder_cost_cost: 134.178924561


-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Training status:
         batch_interrupt_received: False
         epoch_interrupt_received: False
         epoch_started: True
         epochs_done: 0
         iterations_done: 559
         received_first_batch: True
         resumed_from: None
         training_started: True
Log records from the iteration 559:
         decoder_cost_cost: 148.906814575


Input :  Ce morceau de code fournit un aperçu de votre travail , une brève description , et un bouton &gt; Achetez maintenant . </S>
Target:  This bit of code provides a preview of your work , a brief description , and a &gt; Buy Now button . </S>
Sample:  information taxi that It to <UNK> to <UNK> while work . </S>
Sample cost:  230.718

Input :  C ’ est la question écrite que pose sans <UNK> la société de gestion <UNK> Active <UNK> à l&apos; assemblée 2010 de <UNK> . </S>
Target:  This is the written question which puts the asset manager company <UNK> Active Investors to the 2010 of the <UNK> . </S>
Sample:  a share the look . </S>
Sample cost:  314.725


-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Training status:
         batch_interrupt_received: False
         epoch_interrupt_received: False
         epoch_started: True
         epochs_done: 0
         iterations_done: 560
         received_first_batch: True
         resumed_from: None
         training_started: True
Log records from the iteration 560:
         decoder_cost_cost: 164.382553101


INFO:machine_translation.checkpoint: Saving model
INFO:machine_translation.checkpoint: ...saving parameters
INFO:machine_translation.checkpoint: ...saving iteration state
INFO:machine_translation.checkpoint: ...saving log
INFO:machine_translation.checkpoint: Model saved, took 13.1227090359 seconds.
.
.
.

But when I use the translation mode (from https://github.com/mila-udem/blocks-examples/pull/43/files#r62537666), even on a sentence picked from the training data itself, it always gives me a sequence of '</S>', 'the', or '.' tokens. Here's the translation log:

> blocks-examples $ python -m machine_translation --proto get_config_fr2en_cc --mode translate --test-file data/my_test.fr.tok
INFO:__main__:Model options:
{'batch_size': 80,
 'beam_size': 12,
 'bleu_script': './data/multi-bleu.perl',
 'bleu_val_freq': 5000,
 'bos_token': '<S>',
 'dec_embed': 620,
 'dec_nhids': 1000,
 'dropout': 1.0,
 'enc_embed': 620,
 'enc_nhids': 1000,
 'eos_token': '</S>',
 'finish_after': 1000000,
 'hook_samples': 2,
 'normalized_bleu': True,
 'output_val_set': True,
 'reload': True,
 'sampling_freq': 13,
 'save_freq': 10,
 'saveto': 'search_model_fr2en_cc',
 'seq_len': 50,
 'sort_k_batches': 12,
 'src_data': './data/commoncrawl.fr-en.fr.tok.shuf',
 'src_vocab': './data/vocab.fr-en.fr.pkl',
 'src_vocab_size': 30000,
 'step_clipping': 1.0,
 'step_rule': 'AdaDelta',
 'stream': 'stream',
 'test_set': 'data/my_test.fr.tok',
 'trg_data': './data/commoncrawl.fr-en.en.tok.shuf',
 'trg_vocab': './data/vocab.fr-en.en.pkl',
 'trg_vocab_size': 30000,
 'unk_id': 1,
 'unk_token': '<UNK>',
 'val_burn_in': 80000,
 'val_set': './data/newstest2013.fr.tok',
 'val_set_grndtruth': './data/newstest2013.en.tok',
 'val_set_out': 'search_model_fr2en_cc/validation_out.txt',
 'weight_noise_ff': False,
 'weight_noise_rec': False,
 'weight_scale': 0.01}
INFO:machine_translation:Building RNN encoder-decoder
INFO:machine_translation:Creating theano variables
INFO:machine_translation:Building sampling model
INFO:machine_translation:Loading the model..
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /bidirectionalencoder/bidirectionalwmt15/backward.initial_state
INFO:machine_translation.checkpoint: Loaded to CG (2000,)        : /bidirectionalencoder/back_fork/fork_gate_inputs.b
INFO:machine_translation.checkpoint: Loaded to CG (620, 2000)    : /bidirectionalencoder/back_fork/fork_gate_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /bidirectionalencoder/back_fork/fork_inputs.b
INFO:machine_translation.checkpoint: Loaded to CG (620, 1000)    : /bidirectionalencoder/back_fork/fork_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /bidirectionalencoder/bidirectionalwmt15/forward.initial_state
INFO:machine_translation.checkpoint: Loaded to CG (2000,)        : /bidirectionalencoder/fwd_fork/fork_gate_inputs.b
INFO:machine_translation.checkpoint: Loaded to CG (620, 2000)    : /bidirectionalencoder/fwd_fork/fork_gate_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /bidirectionalencoder/fwd_fork/fork_inputs.b
INFO:machine_translation.checkpoint: Loaded to CG (620, 1000)    : /bidirectionalencoder/fwd_fork/fork_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /decoder/sequencegenerator/att_trans/decoder/state_initializer/linear_0.b
INFO:machine_translation.checkpoint: Loaded to CG (30000, 620)   : /bidirectionalencoder/embeddings.W
INFO:machine_translation.checkpoint: Loaded to CG (1000, 2000)   : /bidirectionalencoder/bidirectionalwmt15/forward.state_to_gates
INFO:machine_translation.checkpoint: Loaded to CG (1000, 1000)   : /bidirectionalencoder/bidirectionalwmt15/forward.state_to_state
INFO:machine_translation.checkpoint: Loaded to CG (1000, 2000)   : /bidirectionalencoder/bidirectionalwmt15/backward.state_to_gates
INFO:machine_translation.checkpoint: Loaded to CG (1000, 1000)   : /bidirectionalencoder/bidirectionalwmt15/backward.state_to_state
INFO:machine_translation.checkpoint: Loaded to CG (1000, 1000)   : /decoder/sequencegenerator/att_trans/decoder/state_initializer/linear_0.W
INFO:machine_translation.checkpoint: Loaded to CG (1000, 2000)   : /decoder/sequencegenerator/att_trans/decoder.state_to_gates
INFO:machine_translation.checkpoint: Loaded to CG (2000, 1000)   : /decoder/sequencegenerator/att_trans/attention/preprocess.W
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /decoder/sequencegenerator/att_trans/attention/preprocess.b
INFO:machine_translation.checkpoint: Loaded to CG (1000, 1000)   : /decoder/sequencegenerator/att_trans/attention/state_trans/transform_states.W
INFO:machine_translation.checkpoint: Loaded to CG (1000, 1)      : /decoder/sequencegenerator/att_trans/attention/energy_comp/linear.W
INFO:machine_translation.checkpoint: Loaded to CG (2000, 2000)   : /decoder/sequencegenerator/att_trans/distribute/fork_gate_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (1000, 1000)   : /decoder/sequencegenerator/readout/merge/transform_states.W
INFO:machine_translation.checkpoint: Loaded to CG (30000, 620)   : /decoder/sequencegenerator/readout/lookupfeedbackwmt15/lookuptable.W
INFO:machine_translation.checkpoint: Loaded to CG (620, 1000)    : /decoder/sequencegenerator/readout/merge/transform_feedback.W
INFO:machine_translation.checkpoint: Loaded to CG (2000, 1000)   : /decoder/sequencegenerator/readout/merge/transform_weighted_averages.W
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /decoder/sequencegenerator/readout/initializablefeedforwardsequence/maxout_bias.b
INFO:machine_translation.checkpoint: Loaded to CG (500, 620)     : /decoder/sequencegenerator/readout/initializablefeedforwardsequence/softmax0.W
INFO:machine_translation.checkpoint: Loaded to CG (620, 30000)   : /decoder/sequencegenerator/readout/initializablefeedforwardsequence/softmax1.W
INFO:machine_translation.checkpoint: Loaded to CG (30000,)       : /decoder/sequencegenerator/readout/initializablefeedforwardsequence/softmax1.b
INFO:machine_translation.checkpoint: Loaded to CG (620, 2000)    : /decoder/sequencegenerator/fork/fork_gate_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (2000,)        : /decoder/sequencegenerator/fork/fork_gate_inputs.b
INFO:machine_translation.checkpoint: Loaded to CG (1000, 1000)   : /decoder/sequencegenerator/att_trans/decoder.state_to_state
INFO:machine_translation.checkpoint: Loaded to CG (2000, 1000)   : /decoder/sequencegenerator/att_trans/distribute/fork_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (620, 1000)    : /decoder/sequencegenerator/fork/fork_inputs.W
INFO:machine_translation.checkpoint: Loaded to CG (1000,)        : /decoder/sequencegenerator/fork/fork_inputs.b
INFO:machine_translation.checkpoint: Number of parameters loaded for computation graph: 37
INFO:machine_translation:Started translation:
INFO:machine_translation:Source: ([769, 6, 2979, 2, 1177, 7173, 27, 180, 79, 3494, 3, 2263, 5, 734, 4, 29999],)
INFO:machine_translation:Translated: </S> </S> </S> </S> </S> </S> . </S> . . </S> </S>
INFO:machine_translation:Total cost of the test: 0.929869651794

The 'my_test.fr.tok' file has only one line:
Sur la baie de San Antonio vous avez tous commerces , bars et restaurants .
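One thing I tried to rule out is a vocabulary mismatch between training and translation time. Here's a minimal sketch of that check, with a toy dict standing in for the real pickle (I'm assuming the vocab file is a plain `{token: id}` dict with `unk_id` 1, matching the config above; `encode`/`decode` are my own helpers, not part of blocks-examples):

```python
def encode(tokens, vocab, unk_id=1):
    """Map tokens to ids, falling back to unk_id for out-of-vocabulary words."""
    return [vocab.get(t, unk_id) for t in tokens]

def decode(ids, vocab, unk_token='<UNK>'):
    """Invert the vocab and map ids back to tokens."""
    inv = {i: t for t, i in vocab.items()}
    return [inv.get(i, unk_token) for i in ids]

# Toy vocab standing in for ./data/vocab.fr-en.fr.pkl; for the real file:
#   import pickle
#   vocab = pickle.load(open('./data/vocab.fr-en.fr.pkl', 'rb'))
vocab = {'<S>': 0, '<UNK>': 1, '</S>': 2, 'Sur': 3, 'la': 4, 'baie': 5}

ids = encode('Sur la baie inconnue'.split(), vocab)
print(ids)                  # [3, 4, 5, 1] -- 'inconnue' is OOV
print(decode(ids, vocab))   # ['Sur', 'la', 'baie', '<UNK>']
```

If the ids don't round-trip back to the original tokens, the pickle loaded in translate mode doesn't match the one used for training.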

Appreciate any help. Thanks.

@dmitriy-serdyuk
Contributor

Perhaps something is wrong with your dataset. This doesn't look good:

Input :  Ce morceau de code fournit un aperçu de votre travail , une brève description , et un bouton &gt; Achetez maintenant . </S>
Target:  This bit of code provides a preview of your work , a brief description , and a &gt; Buy Now button . </S>
Sample:  information taxi that It to <UNK> to <UNK> while work . </S>
Sample cost:  230.718
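For example, you could spot-check that the shuffled source/target files are still parallel. A hypothetical helper (the real inputs would be your `src_data`/`trg_data` files from the config, read line by line; the thresholds here are just a rough heuristic):

```python
def check_parallel(src_lines, trg_lines, max_ratio=3.0):
    """Return indices of suspicious pairs: empty sides or extreme length ratios."""
    assert len(src_lines) == len(trg_lines), 'source/target files are not parallel'
    bad = []
    for i, (s, t) in enumerate(zip(src_lines, trg_lines)):
        ns, nt = len(s.split()), len(t.split())
        if ns == 0 or nt == 0 or ns / nt > max_ratio or nt / ns > max_ratio:
            bad.append(i)
    return bad

src = ['Sur la baie de San Antonio vous avez tous commerces .', 'Bonjour .']
trg = ['On the bay of San Antonio you have all shops .', 'Hello .']
print(check_parallel(src, trg))  # [] -- nothing suspicious
```

A mismatch in line counts, or many flagged pairs, would explain samples like the one above.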


HuidaQ commented Oct 9, 2017

I tried the default dataset from the prepare_data.py script (parallel-nc-v10) and got the same result.
