
I am using the hyperas module to tune my Keras model, and it raises this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 4785: ordinal not in range(128)

The error is raised at the call below, on the line with the trials argument:

if __name__ == '__main__':
    best_run, best_model = optim.minimize(model=create_model,
                                      data=data,
                                      algo=tpe.suggest,
                                      max_evals=20,
                                      trials=Trials())

I think the problem originates from the numpy .npy files I load, which I believe are ASCII-encoded. If so, how can I convert them from ASCII to UTF-8?

I have seen a suggested fix that adds encoding='latin1' to np.load, but it does not work:

label = np.load(os.getcwd() + '/Simu_Sample_label_1000.npy', encoding="latin1")
sample = np.load(os.getcwd() + '/Training_Sample_1000.npy', encoding="latin1")

Here is the full traceback:

In [3]: %run 1dCNN.py
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
~/subg_ps/cnn_train/1dCNN.py in <module>()
    127                                           algo=tpe.suggest,
    128                                           max_evals=20,
--> 129                                           trials=Trials())
    130     trX, trY, teX, teY = data()
    131     print("Evalutation of best performing model:")

~/anaconda3/lib/python3.6/site-packages/hyperas/optim.py in minimize(model, data, algo, max_evals, trials, functions, rseed, notebook_name, verbose, eval_space, return_space, keep_temp)
     67                                      notebook_name=notebook_name,
     68                                      verbose=verbose,
---> 69                                      keep_temp=keep_temp)
     70
     71     best_model = None

~/anaconda3/lib/python3.6/site-packages/hyperas/optim.py in base_minimizer(model, data, functions, algo, max_evals, trials, rseed, full_model_string, notebook_name, verbose, stack, keep_temp)
     96         model_str = full_model_string
     97     else:
---> 98         model_str = get_hyperopt_model_string(model, data, functions, notebook_name, verbose, stack)
     99     temp_file = './temp_model.py'
    100     write_temp_files(model_str, temp_file)

~/anaconda3/lib/python3.6/site-packages/hyperas/optim.py in get_hyperopt_model_string(model, data, functions, notebook_name, verbose, stack)
    184         calling_script_file = os.path.abspath(inspect.stack()[stack][1])
    185         with open(calling_script_file, 'r') as f:
--> 186             source = f.read()
    187
    188     cleaned_source = remove_all_comments(source)

~/anaconda3/lib/python3.6/encodings/ascii.py in decode(self, input, final)
     24 class IncrementalDecoder(codecs.IncrementalDecoder):
     25     def decode(self, input, final=False):
---> 26         return codecs.ascii_decode(input, self.errors)[0]
     27
     28 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 4785: ordinal not in range(128)

I have included the full traceback above. The complete code is here: https://github.com/MinghaoDu1994/MyPythonFunctions/blob/master/1Dcnn

I think the problem is caused by the Trials function in hyperopt, but I could not find any existing question like mine.

Minghao
  • From the error and its traceback, do you know which array has the problem? What's the connection between the `label` and `sample` arrays and the model call? I don't see those variables. What's the source of those `npy` files? Note that the `encoding` parameter has very limited applicability (read the docs). – hpaulj Mar 19 '19 at 17:40
  • No, the traceback only tells me that the error comes from the first code snippet I quoted; `label` and `sample` make up the `data=data` parameter. I searched for solutions to this error and inferred that it was due to my input data. My `npy` files were produced by another of my programs, also running Python 3, so I would not expect this error here. I have read the docs and only tried the `encoding` parameter in case it helped, but it did not. – Minghao Mar 20 '19 at 01:08

2 Answers

The problem has been solved. In my case, before calling optim.minimize I had to define the two functions with the exact names data and model, rather than create_model or any other name. It is a very strict limitation.

Minghao

I can recreate your error by converting a unicode string (the PY3 default) to a bytestring, and then trying to decode it as ASCII:

In [347]: astr = 'abc'+chr(0xe8)+'xyz'                                                    
In [348]: astr                                                                            
Out[348]: 'abcèxyz'
In [349]: astr.encode('latin1')                                                           
Out[349]: b'abc\xe8xyz'
In [350]: astr.encode('latin1').decode('ascii')                                           
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-350-1825a76f5d5b> in <module>
----> 1 astr.encode('latin1').decode('ascii')

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 3: ordinal not in range(128)

hyperas is reading some sort of script file in get_hyperopt_model_string(). I can't tell what variable is controlling this read; maybe it's the notebook. I don't think the arrays that you loaded from npy files have anything to do with this problem. It's decoding a large string (the error is at position 4785), not some element of an array.
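The same failure can be reproduced with a file read rather than an in-memory string. The sketch below is only an illustration under assumptions: the temporary file is a hypothetical stand-in for the script that hyperas reads, its contents are made up, and encoding="ascii" is passed explicitly to imitate an environment whose default text encoding is ASCII (as the encodings/ascii.py frame in the traceback suggests).

```python
import os
import tempfile

# Hypothetical stand-in for the script being read: ASCII code plus one
# UTF-8 encoded non-ASCII character (bytes e8 b0 83) inside a comment.
script_bytes = b"x = 1  # \xe8\xb0\x83\n"

with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as tmp:
    tmp.write(script_bytes)
    script_path = tmp.name

try:
    # Assumption: encoding="ascii" imitates a locale whose preferred
    # encoding is ASCII, which a bare open(path, 'r') would use there.
    try:
        with open(script_path, "r", encoding="ascii") as f:
            f.read()
        ascii_error = None
    except UnicodeDecodeError as exc:
        ascii_error = exc

    # Reading the same bytes with an explicit UTF-8 encoding succeeds.
    with open(script_path, "r", encoding="utf-8") as f:
        utf8_source = f.read()
finally:
    os.remove(script_path)

print(ascii_error)
print(repr(utf8_source))
```

The point of the sketch is that the decode error depends on the text of the file being opened and on the encoding used to open it, not on any array passed to the model.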

In short, this is a hyperas model problem, not an npy file one.
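To check which file and which characters are involved, one option is to scan the suspected script for bytes outside the ASCII range. This is a minimal sketch, not part of hyperas: find_non_ascii is a hypothetical helper and the sample bytes are invented for illustration.

```python
def find_non_ascii(data: bytes):
    """Return (byte_offset, line_number, byte_value) for each non-ASCII byte."""
    hits = []
    line = 1
    for offset, byte in enumerate(data):
        if byte == 0x0A:  # newline: the next byte starts a new line
            line += 1
        elif byte > 0x7F:  # anything above 127 is outside ASCII
            hits.append((offset, line, byte))
    return hits


# Hypothetical script contents with one non-ASCII character in a comment.
sample = b"import os\n# note: \xe8\xb0\x83\nx = 1\n"
hits = find_non_ascii(sample)
for offset, line, byte in hits:
    print(f"offset {offset}, line {line}: 0x{byte:02x}")
```

For a real file, the bytes would come from open(path, 'rb').read(). Judging by the calling_script_file line in the traceback, the file being read is probably the script that was run, so if that guess is right the first hit in 1dCNN.py should be byte 0xe8 at offset 4785.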

hpaulj
  • I can't find any related parameters controlling the read; maybe I should change the description of the question. Thanks a lot. – Minghao Mar 20 '19 at 03:28