I'm trying to download an MP3 file, via its URL, using Python's urllib2.

import urllib2

# url and dst are the source URL and the destination path
mp3file = urllib2.urlopen(url)
with open(dst, 'wb') as output:
    output.write(mp3file.read())

I'm getting a urllib2.HTTPError: HTTP Error 403: Forbidden error. Trying urllib also fails, but silently.

urllib.urlretrieve(url, dst)

However, if I use wget, I can download the file successfully.

I've noted the general differences between the two methods mentioned in "Difference between Python urllib.urlretrieve() and wget", but they don't seem to apply here.

Is wget doing something to negotiate permissions that urllib2 doesn't do? If so, what, and how do I replicate this in urllib2?

Richard Horrocks
  • That is completely dependent on the server. Have you tried `wget --verbose` to see what's happening? – Jasper Apr 16 '14 at 12:14
  • Have you tried adding headers: http://stackoverflow.com/questions/13303449/urllib2-httperror-http-error-403-forbidden – etna Apr 16 '14 at 12:17
  • It appears `wget`'s default output level is verbose, so it's not giving me anything extra when the flag is given explicitly. I'll try playing around with the headers... – Richard Horrocks Apr 16 '14 at 19:34

2 Answers

1

This could be something on the server side, for example the server blocking Python's default User-Agent. Try sending wget's User-Agent string instead: Wget/1.13.4 (linux-gnu).

In Python 2:

import urllib

# Override the User-Agent header that urllib sends
class AppURLopener(urllib.FancyURLopener):
    version = "Wget/1.13.4 (linux-gnu)"

url = "http://www.example.com/test_file"
fname = "test_file"

# Install the custom opener so that urlretrieve() uses it
urllib._urlopener = AppURLopener()
urllib.urlretrieve(url, fname)
VirtualScooter
WeaselFox

The above didn't work for me (I'm using Python 3.5), although wget works fine.

It's not (I assume) a huge problem for me: I can still call wget through system(), with some file renaming and munging.

But in case anyone else is suffering from the same problem, these are the errors I get from the above snippet:

Traceback (most recent call last):
  File "./mksynt.py", line 10, in <module>
    class AppURLopener(urllib.FancyURLopener):
AttributeError: module 'urllib' has no attribute 'FancyURLopener'

I see that the original answer was only promised to work in Python 2.
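
For anyone on Python 3, here is a minimal sketch of the same idea (sending wget's User-Agent string) using urllib.request instead of FancyURLopener. It assumes the 403 really is caused by the server rejecting Python's default User-Agent, which I can't confirm for the server in the question. The helper names build_request and download are my own, and the URL and filename in the usage comment are the same placeholders as in the snippet above.

```python
import shutil
import urllib.request

# Assumption: the server rejects Python's default User-Agent, so we send
# the wget string suggested in the earlier answer instead.
WGET_USER_AGENT = "Wget/1.13.4 (linux-gnu)"


def build_request(url, user_agent=WGET_USER_AGENT):
    """Return a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, dst, user_agent=WGET_USER_AGENT):
    """Stream the response body for url into the file at dst."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request) as response, open(dst, "wb") as output:
        # Copy in chunks rather than reading the whole file into memory
        shutil.copyfileobj(response, output)


# Usage (placeholder URL and filename):
# download("http://www.example.com/test_file", "test_file")
```

If the server still answers 403 with this header, urlopen() raises urllib.error.HTTPError, and the block is probably based on something other than the User-Agent (cookies or a Referer check, for example).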