
Background: I used `requests.get` in Python 2.7 and saved the response content the same way whether it was binary or text. After moving to Python 3.7, I had to use the response's `.text` to save text. Unfortunately, I also saved binary data (JPG files) using `.text`.

I have read "What is the difference between 'content' and 'text'" on Stack Overflow. Is there a way to convert the saved response text back to content (that is, the binary JPG format) from the file?

I tried simply reading the file and converting it to binary, but that failed.

My method was:

  html = requests.get(url, headers={"User-Agent":self.user_agents[agent]})
  html.encoding = html.apparent_encoding
  with open(fname, "w") as fh:
    fh.write(html.text)
    fh.close()

Many thanks

Supplementary: The `html.text` output has already been saved to a file. The problem is how to convert or decode that file (not the `requests.get` response) back to binary.

Rocky
  • You will need to use e.g. `jpg_bytes = requests.get('http://example.com/image.jpg').content` for any binary data; `.text` is only for human-readable text. – metatoaster Jul 12 '20 at 10:02
  • You are using `with`, therefore no need to call close on the file handle. – user69453 Jul 12 '20 at 10:04
  • @metatoaster, sorry for my unclear question. I can save the binary content now, but the previous files were saved using `.text` (human-readable text). How can I convert (decode) the text format back to binary? – Rocky Jul 14 '20 at 00:58
  • Just open the original file that you saved in binary mode, as in the answer; reading it will return `bytes`. Opening it without binary mode makes the `read` method try to decode the data to a `str` with the default codec, which succeeds only if whatever was read decodes properly. – metatoaster Jul 14 '20 at 02:41
  • @andrea-blengino, thanks for formatting the code – Rocky Jul 14 '20 at 03:35
  • @metatoaster Thanks a lot. I tried this code: `with open(fn, "rb") as oh: b = oh.read()`, then `with open(fn2, "wb") as bh: bh.write(b)`, but both files are the same. I found that the file starts with "ef bf bd ef bf bd ef bf...." etc. The file is JFIF. – Rocky Jul 15 '20 at 14:17
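
The `ef bf bd` sequence reported in the last comment is the UTF-8 encoding of U+FFFD, the Unicode replacement character. As far as I can tell, `requests` builds `.text` by decoding with `errors="replace"`, so each byte sequence that is invalid in the chosen encoding becomes that same character. The sketch below assumes the encoding was UTF-8 and uses a hand-written JFIF header (an illustration, not data from the original file) to show why that step probably cannot be undone:

```python
# First bytes of a typical JFIF file: SOI marker (ff d8), APP0 marker (ff e0),
# segment length (00 10) and the identifier "JFIF\0".
# This header is an assumed example, not taken from the file in the question.
jpeg_header = bytes.fromhex("ffd8ffe000104a46494600")

# Decode the way a lenient text decoder would: invalid bytes become U+FFFD.
text = jpeg_header.decode("utf-8", errors="replace")

# Writing that text to a file opened with "w" (UTF-8) encodes it again.
round_tripped = text.encode("utf-8")

print(jpeg_header.hex(" "))
print(round_tripped.hex(" "))

# The original marker bytes are gone: ff, d8 and e0 all turned into the
# same replacement character, so nothing records which byte was which.
print(round_tripped == jpeg_header)
```

Because several different input bytes collapse into one identical character, the saved file no longer holds enough information to rebuild the image; under these assumptions, downloading the file again with `.content` is the only reliable fix.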

1 Answer


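If you want to write the content of a binary file, I suggest opening the file in binary mode (`"wb"`) and writing `html.content` instead of `html.text`. This should work:

# html is the response returned by requests.get(url, ...) in the question
with open(fname, "wb") as fh:
    fh.write(html.content)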
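
To find out which of the previously saved files were damaged by being written from `.text`, one option is to look at the first bytes of each file. The following is a minimal sketch; the helper names `save_binary` and `classify_saved_file` are hypothetical, and the check assumes the damaged files start with the UTF-8 replacement character, as reported in the question's comments:

```python
JPEG_SOI = b"\xff\xd8"         # every JPEG file starts with this marker
REPLACEMENT = b"\xef\xbf\xbd"  # UTF-8 encoding of U+FFFD


def save_binary(path, data):
    """Write raw bytes (e.g. response.content) with no text encoding."""
    with open(path, "wb") as fh:
        fh.write(data)


def classify_saved_file(path):
    """Return 'jpeg', 'damaged' or 'unknown' based on the first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(3)
    if head.startswith(JPEG_SOI):
        return "jpeg"
    if head == REPLACEMENT:
        return "damaged"
    return "unknown"
```

Files classified as `damaged` would need to be downloaded again and saved with `save_binary(fname, html.content)`.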
mjacq
  • Thanks a lot. I now use `wb` and `html.content` to save. But the previous files were saved with `html.text`, and I have trouble converting (decoding) them back to binary. Do you have any idea how to do this? – Rocky Jul 14 '20 at 00:57