
I am reading a file in binary mode into a buffer, performing some replacements on that buffer, and then inserting the buffered data into a MySQL database. Every byte was inserted as-is, regardless of any encoding. With Python 2.7 this worked well.

Code:

with open(binfile,'rb') as fd_bin:
   bin_data = fd_bin.read()
   bin_data = bin_data.replace('"','\\"')
   db_cursor.execute("INSERT INTO table BIN_DATA values {}".format(bin_data))

With Python 3.4 the data needed to be decoded, so I used:

   bin_data = fd_bin.read()
   bin_data = bin_data.decode('utf-8') # error at this line

The second line raised this error:

bindata = bindata.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 1: invalid start byte

I also tried decoding with latin-1 and iso-8859-1, but then extra bytes get inserted in some places. When I fetch the data back from the database it no longer matches the original, whereas it did with Python 2.7.
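One possible explanation for the extra bytes, assuming the decoded string is later re-encoded as UTF-8 on its way to the server (the connection charset is not shown here, so this is a guess): decoding with latin-1 maps every byte to one code point and is lossless on its own, but encoding those code points as UTF-8 turns each byte at or above 0x80 into two bytes. A minimal sketch with made-up sample bytes:

```python
# Hypothetical sample; 0x93 is the byte named in the traceback above.
raw = b"\x01\x93\x22\xff"

# latin-1 never fails: each byte becomes the code point with the same value.
text = raw.decode("latin-1")
assert text.encode("latin-1") == raw  # round trip through latin-1 is lossless

# If the string is sent as UTF-8 instead, bytes >= 0x80 grow to two bytes each.
as_utf8 = text.encode("utf-8")
print(len(raw), len(as_utf8))  # prints: 4 6
```

Here 0x93 becomes `C2 93` and 0xFF becomes `C3 BF`, which would look like stray bytes appearing in the stored data.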

How can I insert or decode this data regardless of its encoding? I can't skip or ignore any bytes.
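For comparison, a common way to avoid decoding altogether is to pass the raw bytes as a bound query parameter rather than formatting them into the SQL string. The sketch below is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module as a runnable stand-in for MySQL, and the table name `files`, the column `bin_data`, and the helper `store_and_fetch` are hypothetical. MySQL drivers for Python typically use `%s` as the placeholder instead of `?`.

```python
import sqlite3

def store_and_fetch(payload: bytes) -> bytes:
    """Insert bytes through a bound parameter and read them back unchanged."""
    conn = sqlite3.connect(":memory:")  # stand-in database, not MySQL
    try:
        conn.execute("CREATE TABLE files (bin_data BLOB)")
        # The driver handles quoting; no manual replace() or decode() is needed.
        conn.execute("INSERT INTO files (bin_data) VALUES (?)", (payload,))
        row = conn.execute("SELECT bin_data FROM files").fetchone()
        return bytes(row[0])
    finally:
        conn.close()

# Sample containing a NUL, the 0x93 byte, a double quote, a backslash and 0xFF.
sample = b'\x00\x93"\\\xff'
assert store_and_fetch(sample) == sample
```

With this approach the manual escaping of `"` is no longer needed, since the bytes never become part of the SQL text.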

Essex
  • Possible duplicate of [unicode().decode('utf-8', 'ignore') raising UnicodeEncodeError](http://stackoverflow.com/questions/5096776/unicode-decodeutf-8-ignore-raising-unicodeencodeerror) – linusg Nov 25 '16 at 12:50
  • Honestly, this question was around on StackOverflow so many, many, many times. – linusg Nov 25 '16 at 12:50

0 Answers