
I have a pandas dataframe. The 'data' column contains bytes objects (binary files).

df = pd.DataFrame({'file_hash' : [01ccba93f3647ca50..., 739b24dc0dfea....], 'data' : [b'x\x9cd\xbbuT\x1c\xc1\xb7-\xdc\xf8 A\x12\xdc..., b'x\x9c\xcc\xbaeTT\xdf\x1b?z\x08\t\xa5A%$\x15a...]})

Now, I am sending this through an HTTP server:

bytes_obj = zlib.compress((output.to_csv(index=False)).encode())
self.wfile.write(bytes_obj)

I am able to read the dataframe on the client side:

response = requests.get(url)
response_bytes = response.content
response_dataframe = pd.read_csv(io.BytesIO(zlib.decompress(response_bytes)))

The bytes objects are now strings like "b'x\x9cd\xbbuT\x1c\xc1\xb7-...". If I convert these strings to bytes, they become like b'b\'x\\x9cd\\xbbuT\\x1c\\xc1\\xb7

I tried many ways but just cannot get back the exact bytes objects. I would really appreciate some suggestions.

Thanks

Sam11

1 Answer


It feels stupid but I think I should mention the solution as it can save someone's time.

I just used the ast.literal_eval function from the ast module.

Basically, literal_eval converts a string that looks like a bytes object into a real bytes object. So the string "b'x\x9cd\xbbuT\x1c\xc1\xb7-..." becomes the bytes object b'x\x9cd\xbbuT\x1c\xc1\xb7-...
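A minimal sketch of that conversion on a single value. The `payload` bytes here are made up for illustration and are not the data from the question:

```python
import ast

# Hypothetical payload standing in for one cell of the 'data' column.
payload = b"x\x9cd\xbb\x00\xff"

# Writing bytes to CSV stores their text form, e.g. "b'x\\x9cd...'".
as_text = str(payload)

# literal_eval parses that text as a Python literal and returns real bytes.
restored = ast.literal_eval(as_text)
```

Since literal_eval only parses Python literals, it does not run arbitrary code the way eval would.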

In my case,

response_dataframe["data"] = response_dataframe["data"].apply(lambda x: ast.literal_eval(x))

and now the data column contains the bytes objects I needed.
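For anyone who wants to check this without running a server, here is a rough sketch of the whole round trip in one script. The dataframe contents are hypothetical placeholders, and the HTTP transfer is skipped, so `bytes_obj` is passed straight to the client-side code:

```python
import ast
import io
import zlib

import pandas as pd

# Hypothetical stand-in for the real dataframe; the values are made up.
output = pd.DataFrame({
    "file_hash": ["hash_a", "hash_b"],
    "data": [zlib.compress(b"first file"), zlib.compress(b"second file")],
})

# Server side: to_csv stores each bytes cell as its text form "b'...'".
bytes_obj = zlib.compress(output.to_csv(index=False).encode())

# Client side: after read_csv the 'data' column holds strings, not bytes.
response_dataframe = pd.read_csv(io.BytesIO(zlib.decompress(bytes_obj)))

# Parse each string back into the bytes object it describes.
response_dataframe["data"] = response_dataframe["data"].apply(ast.literal_eval)
```

After the last line, each entry in the data column should compare equal to the original bytes object on the server side.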

Sam11