How can I download a binary file using Python and WinHTTPRequest?

Question

I need to download a batch of PDF files from the web. I usually use the urllib3 library, but this is a corporate website that requires authentication. I can download a regular HTML page using the following:

import win32com.client

url = 'https://corpweb.example/index.html'
h = win32com.client.Dispatch('WinHTTP.WinHTTPRequest.5.1')
h.SetAutoLogonPolicy(0)
h.Open('GET', url, False)
h.Send()
result = h.responseText

But this approach doesn't work for a PDF:

url = "https://corpweb.example/file.pdf"
h = win32com.client.Dispatch('WinHTTP.WinHTTPRequest.5.1')
h.SetAutoLogonPolicy(0)
h.Open('GET', url, False)
h.Send()
with open(filename, 'wb') as f:
    f.write(h.responseText)

I get an error:

TypeError: a bytes-like object is required, not 'str'

What can I do?

user3840170 · Answer 1 · 2021-03-26T10:03:40.867

1

As Microsoft’s documentation of WinHttpRequest explains, responseText contains the response body as Unicode text. To obtain the response body as raw bytes, use responseBody instead.

Also consider using responseStream instead of either, to avoid keeping the entire file in memory at once.
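A minimal sketch of the responseBody approach follows. It assumes pywin32 returns responseBody as a bytes-like object (typically a memoryview); passing it through bytes() also copes with a sequence of integers. The helper names save_response_body and download_with_winhttp are hypothetical, chosen only for this illustration.

```python
def save_response_body(request, path):
    """Write the raw response bytes of a completed request to path.

    `request` is expected to expose `responseBody`, as the
    WinHttpRequest object in the question does after Send().
    Returns the number of bytes written.
    """
    # bytes() accepts a memoryview, a bytes object, or a sequence of
    # integers in the range 0..255, so the exact type that pywin32
    # hands back does not matter here (assumption stated above).
    data = bytes(request.responseBody)
    with open(path, 'wb') as f:
        f.write(data)
    return len(data)


def download_with_winhttp(url, path):
    """Fetch url with WinHttpRequest and save the body to path (Windows only)."""
    # Imported here so the helper above stays usable without pywin32.
    import win32com.client

    h = win32com.client.Dispatch('WinHTTP.WinHTTPRequest.5.1')
    h.SetAutoLogonPolicy(0)  # same logon policy as in the question
    h.Open('GET', url, False)
    h.Send()
    if h.Status != 200:
        raise RuntimeError('unexpected HTTP status %s' % h.Status)
    return save_response_body(h, path)
```

With the URL from the question, the call would be download_with_winhttp('https://corpweb.example/file.pdf', 'file.pdf'). The whole body is still held in memory once, so this does not replace the responseStream suggestion for very large files.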

edited Mar 26 '21 at 10:03

answered Mar 26 '21 at 08:54

user3840170


score -1 · Answer 2 · answered Mar 26 '21 at 09:03

Try using urllib.request.urlretrieve(url, filepath)?

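import urllib.request

url = "https://corpweb/file.pdf"
# Keep the module name and the URL variable distinct; reusing "url" for both
# would shadow the module and make the urlretrieve call fail
urllib.request.urlretrieve(url, "file.pdf")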

This may be the simplest solution. Alternatively, you can use requests:

import requests

url = "https://corpweb/file.pdf"
resp = requests.get(url)  # Get the response
# Opening in "wb" mode creates the file, so no separate creation step is needed
with open("file.pdf", "wb") as f:
    f.write(resp.content)  # resp.content is the response body as bytes

score -2 · Answer 3 · edited Mar 26 '21 at 08:43


You opened the file in binary mode:

with open(fname, 'rb') as f:
    ...

This means that all data read from the file is returned as bytes objects, not str. You cannot then use a string in a containment test:

if 'some-pattern' in tmp:
    continue
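To illustrate the point about containment tests, here is a small sketch. The sample bytes stand in for data read from a file opened with 'rb'; the variable names are made up for this example.

```python
# Stand-in for data read from a file opened in binary ('rb') mode
data = b"header some-pattern trailer"

error = None
try:
    'some-pattern' in data  # str pattern against bytes data
except TypeError as exc:
    error = str(exc)  # same kind of error as in the question

# Either compare bytes with bytes ...
found = b'some-pattern' in data

# ... or decode first and compare str with str
# (ASCII is an assumption that holds for this sample only)
found_after_decoding = 'some-pattern' in data.decode('ascii')
```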

edited Mar 26 '21 at 08:43

user3840170


answered Mar 26 '21 at 08:39

Mohammad Taghdir


I don't want to read from the file. I want to save the pdf from the web server to my local drive. – Javier Alvarez Mar 26 '21 at 08:43
