I am new to Google Colaboratory (Colab) and to using PyDrive with it. I am trying to load the data in 'CAS_num_strings', which I wrote to a pickle file in a specific directory on my Google Drive from Colab as follows:
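import pickle
# Use a context manager so the file is flushed and closed before the upload.
with open('CAS_num_strings.p', 'wb') as f:
    pickle.dump(CAS_num_strings, f)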
dump_meta = {'title': 'CAS.pkl', 'parents': [{'id':'1UEqIADV_tHic1Le0zlT25iYB7T6dBpBj'}]}
pkl_dump = drive.CreateFile(dump_meta)
pkl_dump.SetContentFile('CAS_num_strings.p')
pkl_dump.Upload()
print(pkl_dump.get('id'))
Here 'id':'1UEqIADV_tHic1Le0zlT25iYB7T6dBpBj' places the file in the specific parent folder identified by that id. The last print command gives me the output:
'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'
So I am able to create and upload the pickle file, whose id is '1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'. Now I want to load this pickle file in another Colab notebook for a different purpose. To load it, I use the following commands:
cas_strings = drive.CreateFile({'id':'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'})
print('title: %s, mimeType: %s' % (cas_strings['title'], cas_strings['mimeType']))
print('Downloaded content "{}"'.format(cas_strings.GetContentString()))
This gives me the output:
title: CAS.pkl, mimeType: text/x-pascal
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-9-a80d9de0fecf> in <module>()
30 cas_strings = drive.CreateFile({'id':'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'})
31 print('title: %s, mimeType: %s' % (cas_strings['title'], cas_strings['mimeType']))
---> 32 print('Downloaded content "{}"'.format(cas_strings.GetContentString()))
33
34
/usr/local/lib/python3.6/dist-packages/pydrive/files.py in GetContentString(self, mimetype, encoding, remove_bom)
192 self.has_bom == remove_bom:
193 self.FetchContent(mimetype, remove_bom)
--> 194 return self.content.getvalue().decode(encoding)
195
196 def GetContentFile(self, filename, mimetype=None, remove_bom=False):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
As you can see, it finds the file CAS.pkl but cannot decode the data. I would like to resolve this error. I understand that ordinary pickle dumping and loading works smoothly with the 'wb' and 'rb' options, with no decoding problems. In the present case, however, after dumping I cannot load the data back from the pickle file created on Google Drive in the previous step. The problem seems to be that I cannot specify how the data should be decoded at "return self.content.getvalue().decode(encoding)". I cannot find in the Drive API reference (https://developers.google.com/drive/v2/reference/files#resource-representations) which keywords or metadata tags to modify. Any help is appreciated. Thanks.
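
For reference, here is a minimal sketch that seems to reproduce the same error without Drive, followed by one possible way to read the file back. It assumes the cause is that pickle output is binary and is being decoded as UTF-8 text by GetContentString(). The names `sample`, `load_pickle_from_drive_file` and `local_path` are hypothetical, and the helper relies only on the GetContentFile method visible in the traceback above; it has not been run against a real Drive file here.

```python
import pickle

# Hypothetical stand-in for the data stored in CAS_num_strings.
sample = ['50-00-0', '64-17-5']

# Pickle output is binary; protocol 2 and later start with the byte 0x80.
raw = pickle.dumps(sample)
print(raw[:1])  # b'\x80'

# Decoding those bytes as UTF-8 text fails, as in the traceback above.
try:
    raw.decode('utf-8')
except UnicodeDecodeError as err:
    print(err)

# Reading the same bytes back in binary form works.
assert pickle.loads(raw) == sample


def load_pickle_from_drive_file(drive_file, local_path):
    """Download a Drive file to local_path, then unpickle it in binary mode.

    Assumes drive_file is a PyDrive file object whose GetContentFile(filename)
    writes the downloaded bytes to filename.
    """
    drive_file.GetContentFile(local_path)
    with open(local_path, 'rb') as f:
        return pickle.load(f)
```

Under those assumptions, the second notebook would call something like load_pickle_from_drive_file(drive.CreateFile({'id':'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'}), 'CAS_num_strings.p') instead of GetContentString(), so the bytes are never decoded as text.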