
I am trying to download a .torrent file from a password-protected site. I have managed to get into the site using cookies, like so:

import requests
from bs4 import BeautifulSoup
from requests.exceptions import RequestException

cookies = {'uid': '232323', 'pass': '31321231jh12j3hj213hj213hk',
           '__cfduid': 'kj123kj21kj31k23jkl21j321j3kl213kl21j3'}
try:
    # read site content
    read = requests.get(s_string, cookies=cookies).content
except RequestException as e:
    raise print('Could not connect to somesite: %s' % e)

soup = BeautifulSoup(read, 'html.parser')

With the above code I get access to the site and scrape the data I need. From the scraped data I build a link to a .torrent file, which I then want to download, but this is where I am stuck.
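(For reference, the link-building step looks roughly like this; simplified, and the class_='download' selector is made up, since the real page markup differs:)

# simplified sketch - the selector is invented, the real markup differs
link = soup.find('a', class_='download')
if link:
    torrent_url = link['href']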

Here is what I'm trying right now (the cookie data is not real, obviously, just like in the code above):

from urllib import request

cookies = {'uid': '232323', 'pass': '31321231jh12j3hj213hj213hk',
           '__cfduid': 'kj123kj21kj31k23jkl21j321j3kl213kl21j3'}

# construct download URL
torrent_url = ('https://www.somesite.com/' + torrent_url)
# for testing purposes DELETE!
print('torrent link:', torrent_url)

# download torrent file into a folder
filename = torrent_url.split('/')[-1]
save_as = 'torrents/' + filename + '.torrent'

try:
    r = request.urlretrieve(torrent_url, save_as, data=cookies)
    print("Download successful for: " + filename)
except request.URLError as e:
    raise print("Error :%s" % e)

This code would work without the cookies on a normal site, but the .torrent file I'm trying to get is behind a password/captcha-protected site, so I need to use the cookies to fetch it.

So the question is: what am I doing wrong here? Without data=cookies I get an HTTP 404 error, and with data=cookies I get the following error:

File "/usr/lib/python3.6/http/client.py", line 1064, in _send_output
+ b'\r\n'
TypeError: can't concat str to bytes </error>

P.S. Before anyone asks: yes, I'm 100% sure the torrent_url is correct. I have it printed, and manually copy/pasting it into my own browser prompts the download window for the .torrent file in question.
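(For what it's worth: urlretrieve's data parameter is the POST body and must be bytes, which is where the str/bytes error comes from. With plain urllib the cookies would have to go into a Cookie header instead; an untested sketch:)

from urllib import request

# cookies go in a header, not in the request body
cookie_header = '; '.join('%s=%s' % item for item in cookies.items())
req = request.Request(torrent_url, headers={'Cookie': cookie_header})
with request.urlopen(req) as response, open(save_as, 'wb') as w:
    w.write(response.read())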

EDIT:

try:
    read = requests.session().get(torrent_url)
    with open(save_as, 'wb') as w:
        for chunk in read.iter_content(chunk_size=1024):
            if chunk:
                w.write(chunk)
            w.close()
            print("Download successful for: " + filename)
except request.URLError as e:
    print("Error :%s" % e)

I made this based on furas's suggestion. It works now, but when I try to open the .torrent, the torrent client says "invalid coding, cannot open".

When I open the .torrent file, this is what's inside:

<h1>Not Found</h1>
<p>Sorry pal :(</p>
<script src="/cdn-cgi/apps/head/o1wasdM-xsd3-9gm7FQY.js"></script>

Am I still doing something wrong, or does this have something to do with the site owner preventing programs from downloading .torrents from his site, or something of that nature?
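(One thing worth checking: requests.session() above creates a brand-new session with no cookies attached, so the site most likely serves its error page instead of the file. A quick sanity check before writing would catch that; untested sketch:)

read = requests.get(torrent_url, cookies=cookies)
read.raise_for_status()
# an HTML body here means the login cookies were not accepted
if 'text/html' in read.headers.get('Content-Type', ''):
    raise RuntimeError('got an HTML error page instead of a .torrent file')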

  • you could download using `requests` with `stream=True`, and if you use `requests.Session()` then you don't have to copy cookies. – furas Jan 02 '18 at 14:27
  • btw: cookies are sent as a `header`, but `data` is sent in the body. – furas Jan 02 '18 at 14:32
  • btw: `print()` can take many arguments, so you can do `print("Error:", e)`. And `print()` sends text to the screen but returns `None`, so you end up with `raise None`. Maybe you need `raise "Error :%s" % e` – furas Jan 02 '18 at 14:44
  • I'm still getting the 404 error using the code edited into the main post; guess I did something wrong with the session() – Nanoni Jan 02 '18 at 16:09
  • OK, I got it to work. I couldn't get it working with session(), so I just resend the cookies when downloading torrents; added the new code as an answer – Nanoni Jan 02 '18 at 16:39

1 Answer


This works, but it's not ideal, I think.

cookies = {'uid': '232323', 'pass': '31321231jh12j3hj213hj213hk',
           '__cfduid': 'kj123kj21kj31k23jkl21j321j3kl213kl21j3'}

try:
    read = requests.get(torrent_url, cookies=cookies)
    with open(save_as, 'wb') as w:
        for chunk in read.iter_content(chunk_size=512):
            if chunk:
                w.write(chunk)
    print(filename + ' downloaded successfully!!!')
except requests.exceptions.RequestException as e:
    print("Error :%s" % e)
  • you have wrong indentation - `w.close()` shouldn't be inside the `for` loop - it closes the file after the first chunk of data. Besides, when you use `with` you don't need `w.close()`, because `with` will close it automatically (even if you get an exception) – furas Jan 02 '18 at 16:54
  • of course, I never use close, always with; copied that bit from the interwebs, my mistake :) – Nanoni Jan 02 '18 at 17:40
  • is this working without `close()`? You may try to use `stream=True` in `get()`. In `read` you should have the header `"Content-Length"` with the file size, so you can check whether you downloaded all the data. – furas Jan 02 '18 at 17:51
  • It's working as it is, kinda slow though; I'll work on the code a little bit tomorrow and try to get it to work faster. – Nanoni Jan 02 '18 at 22:46
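Putting furas's suggestions from the comments above together - cookies attached to a Session, stream=True, and a Content-Length check - the download step would look roughly like this (untested sketch; cookies, torrent_url, save_as and filename as defined above):

import os
import requests

with requests.Session() as s:
    s.cookies.update(cookies)  # attach the login cookies once
    read = s.get(torrent_url, stream=True)
    read.raise_for_status()
    with open(save_as, 'wb') as w:
        for chunk in read.iter_content(chunk_size=8192):
            if chunk:  # skip keep-alive chunks
                w.write(chunk)

# compare the saved size against the Content-Length header, if present
expected = int(read.headers.get('Content-Length', 0))
if expected and os.path.getsize(save_as) != expected:
    print('Warning: download may be incomplete for: ' + filename)
else:
    print(filename + ' downloaded successfully!!!')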