python convert request text to content from file

Question

Background: I used requests.get in Python 2.7 and saved the response content regardless of whether it was binary or text. After moving to Python 3.7, I had to use the response text to save text. Unfortunately, I also saved binary data (a JPG file) using the response text.

I have read "What is the difference between 'content' and 'text'" on Stack Overflow. Is there a way to convert the response text saved in the file back to content (that is, the binary JPG format)?

Simply reading the file and converting it to binary failed.

My method was:

  html = requests.get(url, headers={"User-Agent":self.user_agents[agent]})
  html.encoding = html.apparent_encoding
  with open(fname, "w") as fh:
    fh.write(html.text)
    fh.close()

Many thanks

Supplementary: the html.text has already been saved to a file. The problem is how to convert or decode that file (not the result of requests.get) back to binary.

You will need to do e.g. `jpg_bytes = requests.get('http://example.com/image.jpg').content` for any binary data - `.text` is simply only for human readable text. — metatoaster, Jul 12 '20 at 10:02
You are using `with`, therefore no need to call close on the file handle. — user69453, Jul 12 '20 at 10:04
@metatoaster, sorry for my unclear question. I can save the binary content now, but the previous file was saved using .text (human-readable text). How can I convert (decode) that text format back to binary? — Rocky, Jul 14 '20 at 00:58
Just open the original file that you saved in binary mode, as in the answer; reading it will return `bytes`. Opening it without binary mode makes the `read` method try to decode with the default codec to a `str`, which succeeds only if what was read can be decoded properly. — metatoaster, Jul 14 '20 at 02:41
@metatoaster Thanks a lot. I tried this code: `with open(fn, "rb") as oh: b = oh.read(), with open(fn2, "wb") as bh: bh.write(b)` but both files are the same. I found that the header is "ef bf bd ef bf bd ef bf...." etc. The file is JFIF. — Rocky, Jul 15 '20 at 14:17

score 0 · Answer 1 · answered Jul 12 '20 at 10:13

If you want to write the content of a binary file, I suggest that:

you use the b flag to open the file in binary mode
you use .content instead of .text to extract the body from your response (see "What is the difference between 'content' and 'text'")
you skip closing the file, since a context manager (with) does that for you

In the end, this should work:

with open(fname, "wb") as fh:
    fh.write(html.content)
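
For completeness, here is a minimal sketch of the whole download step built on the two lines above. The helper names `save_body` and `download`, the default `user_agent` string, and the `timeout` value are illustrative assumptions, not part of the original code.

```python
def save_body(response, fname):
    """Write the raw bytes of a requests-style response to fname."""
    # .content is bytes, and "wb" writes them without any encoding step.
    with open(fname, "wb") as fh:
        fh.write(response.content)


def download(url, fname, user_agent="example-agent/1.0"):
    """Fetch url and save the body unchanged (hypothetical helper)."""
    # Imported here so save_body can be used without requests installed.
    import requests

    response = requests.get(url, headers={"User-Agent": user_agent}, timeout=30)
    # Fail on 4xx/5xx instead of silently saving an error page.
    response.raise_for_status()
    save_body(response, fname)
```

Because the file is opened in binary mode, no encoding is applied, so this works for text pages and images alike.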

mjacq

Thanks a lot. I now use wb and html.content to save. But the previous file was saved with html.text, and I am having trouble converting (decoding) it back to binary. Do you have any idea how to do this? – Rocky Jul 14 '20 at 00:57
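
On the `ef bf bd` bytes reported in the question comments: that sequence is the UTF-8 encoding of U+FFFD, the replacement character. The sketch below assumes the JPG body was decoded with invalid bytes replaced and then written out as UTF-8, which is what that pattern suggests; the exact codec chosen by `apparent_encoding` is not known from the thread. Under that assumption the round trip is lossy, because every undecodable byte becomes the same character.

```python
# First bytes of a JFIF file: SOI marker, APP0 marker, length, "JFIF\0".
original = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"

# Assumed equivalent of saving html.text in text mode:
# decode the bytes as text, then encode the text again on write.
as_text = original.decode("utf-8", errors="replace")
saved = as_text.encode("utf-8")

print(saved[:6].hex())    # efbfbdefbfbd
print(saved == original)  # False

# Different source bytes collapse to the same replacement character,
# so the saved file no longer records which byte was there.
first = b"\xff".decode("utf-8", errors="replace")
second = b"\xd8".decode("utf-8", errors="replace")
print(first == second == "\ufffd")  # True
```

If the old files look like this, the replaced bytes cannot be reconstructed from the file alone, and downloading the images again with `.content` is the reliable fix.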
