Traceback (most recent call last):
  File "train.py", line 18, in <module>
    tf.app.run(main=nmt.main, argv=[os.getcwd() + '\nmt\nmt\nmt.py'] + unparsed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/nmt.py", line 551, in main
    run_main(FLAGS, default_hparams, train_fn, inference_fn)
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/nmt.py", line 544, in run_main
    train_fn(hparams, target_session=target_session)
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/train.py", line 271, in train
    sample_tgt_data)
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/train.py", line 142, in run_full_eval
    sample_src_data, sample_tgt_data)
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/train.py", line 55, in run_sample_decode
    infer_model.batch_size_placeholder, summary_writer)
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/train.py", line 454, in _sample_decode
    utils.print_out(b" src: " + utils.format_sentence(src_data[decode_id], hparams.subword_option))
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/utils/misc_utils.py", line 193, in format_sentence
    sentence = format_spm_text(sentence)
  File "/home/paperspace/Desktop/nmt-chatbot/nmt/nmt/utils/misc_utils.py", line 181, in format_spm_text
    return u"".join(format_text(symbols).decode("utf-8").split()).replace(
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2581' in position 0: ordinal not in range(128)

I am getting an error in these lines of code:

def format_spm_text(symbols):
  """Decode a text in SPM (https://github.com/google/sentencepiece) 
  format."""
  return u"".join(format_text(symbols).decode("utf-8").split()).replace(
      u"\u2581", u" ").strip().encode("utf-8")

I am trying to train a chatbot by running a file called 'train.py'. I use the command 'sudo python train.py', and my current Python version on Ubuntu is 3.6. On my local macOS machine the exact same code seems to work fine, but there I am running Python 2.7.

  • This error is produced by Python 2.7 (`.../usr/lib/python2.7/encodings/utf_8.py...`). Check `python --version` and try using the `python3` command instead. – Stanislav Ivanov Mar 29 '18 at 13:37
  • Try decoding as "unicode-escape", e.g. `format_text(symbols).decode("unicode-escape")`. – py-D Mar 29 '18 at 14:03
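
One way to act on the first comment is to check which interpreter a given command actually starts, since `sudo python` may not resolve to the same Python as `python` in your own shell. The following is only a minimal sketch; the helper name `describe_interpreter` and the idea of saving it as a separate script are assumptions for illustration, not part of nmt-chatbot.

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter running this script."""
    # sys.version_info[:3] is (major, minor, micro), e.g. (2, 7, 12).
    return "{0} (Python {1}.{2}.{3})".format(
        sys.executable, *sys.version_info[:3])


print(describe_interpreter())
```

Running the script once with `sudo python` and once with `sudo python3` would show which version each command uses. The traceback paths (`/usr/lib/python2.7/...`) suggest that `sudo python` is starting Python 2.7 on the Ubuntu machine.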

1 Answer


Try this:

def format_spm_text(symbols):
  return u"".join(format_text(symbols).decode("unicode-escape").split()).replace(
      u"\u2581", u" ").strip().encode("utf-8")
py-D
  • I tried using "unicode-escape", but I still got the UnicodeEncodeError. Also, when I tried to run the command "sudo python3 train.py", I got an "illegal instruction: core dumped" error. – Ole Martin T Vad Mar 29 '18 at 16:41
  • On macOS this code seems to work fine, but on the Ubuntu virtual desktop I am having a lot of issues with it. I should mention that I use Python 2 on macOS and Python 3.6 on Ubuntu. – Ole Martin T Vad Mar 29 '18 at 16:47
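
A note on why changing the codec name may not help: in Python 2, calling `.decode(...)` on a value that is already a `unicode` object first encodes it implicitly with the ASCII codec, which is where a UnicodeEncodeError can come from inside a decode call. A possible workaround is to decode only when the value is actually a byte string. This is a minimal sketch under that assumption; `to_text` and `format_spm_text_safe` are hypothetical names, and the function takes the output of `format_text` as its argument rather than calling it.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding only if it is a byte string."""
    # On Python 2, bytes is an alias of str, so unicode objects skip the decode.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def format_spm_text_safe(formatted):
    """Hypothetical variant of format_spm_text for bytes or text input."""
    text = to_text(formatted)
    # Drop ordinary whitespace, then turn the SPM marker U+2581 into spaces.
    return "".join(text.split()).replace("\u2581", " ").strip().encode("utf-8")
```

The guard keeps the rest of the original expression unchanged and should behave the same on Python 2 and Python 3. It does not address the separate "illegal instruction" crash under `python3`.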