
I am trying to download a .torrent file from a password-protected site. I have managed to get into the site using cookies, like so:

import requests
from bs4 import BeautifulSoup
from requests.exceptions import RequestException

cookies = {'uid': '232323', 'pass': '31321231jh12j3hj213hj213hk',
           '__cfduid': 'kj123kj21kj31k23jkl21j321j3kl213kl21j3'}
try:
    # read site content
    read = requests.get(s_string, cookies=cookies).content
except RequestException as e:
    raise print('Could not connect to somesite: %s' % e)

soup = BeautifulSoup(read, 'html.parser')

With the above code I get access to the site and scrape the data I need. From the scraped data I build a link to a .torrent file, which I then want to download, but this is where I am stuck.
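(For reference, the link-building step looks roughly like this; simplified, and the class_='download' selector is made up, since the real page markup differs:)

# simplified sketch - the selector is invented, the real markup differs
link = soup.find('a', class_='download')
if link:
    torrent_url = link['href']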

Here is what I'm trying right now (the cookie data is not real, obviously, just like in the code above):

from urllib import request

cookies = {'uid': '232323', 'pass': '31321231jh12j3hj213hj213hk',
           '__cfduid': 'kj123kj21kj31k23jkl21j321j3kl213kl21j3'}

# construct download URL
torrent_url = ('https://www.somesite.com/' + torrent_url)
# for testing purposes DELETE!
print('torrent link:', torrent_url)

# download torrent file into a folder
filename = torrent_url.split('/')[-1]
save_as = 'torrents/' + filename + '.torrent'

try:
    r = request.urlretrieve(torrent_url, save_as, data=cookies)
    print("Download successful for: " + filename)
except request.URLError as e:
    raise print("Error :%s" % e)

This code would work without the cookies on a normal site, but the .torrent file I'm trying to get is behind a password/captcha-protected site, so I need to use the cookies to fetch it.

So the question is: what am I doing wrong here? Without data=cookies I get an HTTP 404 error, and with data=cookies I get the following error:

File "/usr/lib/python3.6/http/client.py", line 1064, in _send_output
+ b'\r\n'
TypeError: can't concat str to bytes </error>

P.S. Before anyone asks: yes, I'm 100% sure the torrent_url is correct. I have it printed, and manually copy/pasting it into my own browser prompts the download window for the .torrent file in question.
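(For what it's worth: urlretrieve's data parameter is the POST body and must be bytes, which is where the str/bytes error comes from. With plain urllib the cookies would have to go into a Cookie header instead; an untested sketch:)

from urllib import request

# cookies go in a header, not in the request body
cookie_header = '; '.join('%s=%s' % item for item in cookies.items())
req = request.Request(torrent_url, headers={'Cookie': cookie_header})
with request.urlopen(req) as response, open(save_as, 'wb') as w:
    w.write(response.read())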

EDIT:

try:
    read = requests.session().get(torrent_url)
    with open(save_as, 'wb') as w:
        for chunk in read.iter_content(chunk_size=1024):
            if chunk:
                w.write(chunk)
            w.close()
            print("Download successful for: " + filename)
except request.URLError as e:
    print("Error :%s" % e)

I made this based on furas's suggestion. It works now, but when I try to open the .torrent, the torrent client says "invalid coding, cannot open".

When I open the .torrent file, this is what's inside:

<h1>Not Found</h1>
<p>Sorry pal :(</p>
<script src="/cdn-cgi/apps/head/o1wasdM-xsd3-9gm7FQY.js"></script>

Am I still doing something wrong, or does this have something to do with the site owner preventing programs from downloading .torrents from his site, or something of that nature?
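(One thing worth checking: requests.session() above creates a brand-new session with no cookies attached, so the site most likely serves its error page instead of the file. A quick sanity check before writing would catch that; untested sketch:)

read = requests.get(torrent_url, cookies=cookies)
read.raise_for_status()
# an HTML body here means the login cookies were not accepted
if 'text/html' in read.headers.get('Content-Type', ''):
    raise RuntimeError('got an HTML error page instead of a .torrent file')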

  • you could download using `requests` with `stream=True`, and if you use `requests.Session()` then you don't have to copy cookies. – furas Jan 02 '18 at 14:27
  • btw: cookies are sent as a `header`, but `data` is sent in the body. – furas Jan 02 '18 at 14:32
  • btw: `print()` can take many arguments, so you can do `print("Error:", e)`. And `print()` sends text to the screen but returns `None`, so you end up with `raise None`. Maybe you need `raise "Error :%s" % e` – furas Jan 02 '18 at 14:44
  • I'm still getting the 404 error using the code edited into the main post; guess I did something wrong with the session() – Nanoni Jan 02 '18 at 16:09
  • OK, I got it to work. I couldn't get it working with session(), so I just resend the cookies when downloading torrents; added the new code as an answer – Nanoni Jan 02 '18 at 16:39

1 Answer


This works, but it's not ideal, I think.

cookies = {'uid': '232323', 'pass': '31321231jh12j3hj213hj213hk',
           '__cfduid': 'kj123kj21kj31k23jkl21j321j3kl213kl21j3'}

try:
    read = requests.get(torrent_url, cookies=cookies)
    with open(save_as, 'wb') as w:
        for chunk in read.iter_content(chunk_size=512):
            if chunk:
                w.write(chunk)
    print(filename + ' downloaded successfully!!!')
except requests.exceptions.RequestException as e:
    print("Error :%s" % e)
  • you have wrong indentation - `w.close()` shouldn't be inside the `for` loop - it closes the file after the first chunk of data. Besides, when you use `with` you don't need `w.close()`, because `with` will close it automatically (even if you get an exception) – furas Jan 02 '18 at 16:54
  • of course, I never use close, always with; copied that bit from the interwebs, my mistake :) – Nanoni Jan 02 '18 at 17:40
  • is this working without `close()`? You may try to use `stream=True` in `get()`. In `read` you should have the header `"Content-Length"` with the file size, so you can check whether you downloaded all the data. – furas Jan 02 '18 at 17:51
  • It's working as it is, kinda slow though; I'll work on the code a little bit tomorrow and try to get it to work faster. – Nanoni Jan 02 '18 at 22:46
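Putting furas's suggestions from the comments above together - cookies attached to a Session, stream=True, and a Content-Length check - the download step would look roughly like this (untested sketch; cookies, torrent_url, save_as and filename as defined above):

import os
import requests

with requests.Session() as s:
    s.cookies.update(cookies)  # attach the login cookies once
    read = s.get(torrent_url, stream=True)
    read.raise_for_status()
    with open(save_as, 'wb') as w:
        for chunk in read.iter_content(chunk_size=8192):
            if chunk:  # skip keep-alive chunks
                w.write(chunk)

# compare the saved size against the Content-Length header, if present
expected = int(read.headers.get('Content-Length', 0))
if expected and os.path.getsize(save_as) != expected:
    print('Warning: download may be incomplete for: ' + filename)
else:
    print(filename + ' downloaded successfully!!!')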