Questions tagged [urllib]

Python module providing a high-level interface for fetching data across the World Wide Web. Predecessor to urllib2. In Python 3, urllib2 and urllib have been reorganized and merged into urllib.

In Python 2, the urllib module is the predecessor of the urllib2 module, although urllib2 still relies on some of urllib's functionality.

In Python 3, the urllib package was reorganized and no longer has any content of its own. All functions and classes live in its submodules: urllib.request, urllib.error, urllib.parse, and urllib.robotparser.

Note that urllib2 doesn't exist in Python 3 anymore.
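
As a rough illustration of the Python 3 layout described above, the sketch below imports from the submodules rather than from the top-level package. The URL and query values are placeholders chosen for the example, and nothing here touches the network.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request

# urllib.parse builds and splits URLs (urllib.urlencode in Python 2).
query = urlencode({"q": "urllib", "page": 2})
url = "https://example.com/search?" + query  # placeholder URL

# urllib.request holds Request and urlopen (urllib2.Request / urllib2.urlopen
# in Python 2). Constructing a Request does not open a connection.
request = Request(url, headers={"Accept": "text/html"})

# urllib.error holds the exception classes; HTTPError subclasses URLError.
is_subclass = issubclass(HTTPError, URLError)
```

Code written for Python 2 that calls urllib2 or urllib.urlencode directly would need its imports adjusted along these lines.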

3960 questions
129
votes
6 answers

urllib2.HTTPError: HTTP Error 403: Forbidden

I am trying to automate the download of historic stock data using Python. The URL I am trying to open responds with a CSV file, but I am unable to open it using urllib2. I have tried changing the user agent as suggested in a few earlier questions, I even tried…
kumar
  • 2,570
  • 2
  • 17
  • 18
128
votes
14 answers

Only add to a dict if a condition is met

I am using urllib.urlencode to build web POST parameters, however there are a few values I only want to be added if a value other than None exists for them. apple = 'green' orange = 'orange' params = urllib.urlencode({ 'apple': apple, …
user1814016
  • 2,273
  • 5
  • 25
  • 28
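
One possible approach to the question above, sketched for Python 3 (where urlencode lives in urllib.parse): filter out the None values with a dict comprehension before encoding. The helper name encode_params and the sample values are made up for the example.

```python
from urllib.parse import urlencode


def encode_params(**values):
    """Encode only the parameters whose value is not None (hypothetical helper)."""
    filtered = {key: value for key, value in values.items() if value is not None}
    return urlencode(filtered)


# 'orange' is dropped because its value is None.
params = encode_params(apple="green", orange=None)
```

Values such as 0 or an empty string are kept, since the filter tests for None explicitly rather than for truthiness.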
116
votes
10 answers

python save image from url

I have a problem when using Python to save an image from a URL, either with a urllib2 request or with urllib.urlretrieve. The URL of the image is valid; I can download it manually using the explorer. However, when I use python to download the…
Shaoxiang Su
  • 1,191
  • 2
  • 8
  • 7
105
votes
3 answers

AttributeError: 'module' object has no attribute 'urlretrieve'

I am trying to write a program that will download mp3's off of a website then join them together but whenever I try to download the files I get this error: Traceback (most recent call last): File "/home/tesla/PycharmProjects/OldSpice/Voicemail.py",…
Sike1217
  • 1,065
  • 2
  • 8
  • 5
102
votes
4 answers

How do I set HTTP headers using Python's urllib?

I am pretty new to Python's urllib. What I need to do is set a custom HTTP header for the request being sent to the server. Specifically, I need to set the Content-Type and Authorization HTTP headers. I have looked into the Python documentation,…
ewok
  • 20,148
  • 51
  • 149
  • 254
102
votes
11 answers

SSL: CERTIFICATE_VERIFY_FAILED with Python3

I apologize if this is a silly question, but I have been trying to teach myself how to use BeautifulSoup so that I can create a few projects. I was following this link as a tutorial: https://www.youtube.com/watch?v=5GzVNi0oTxQ After following the…
PafflesWancakes
  • 1,121
  • 2
  • 8
  • 5
97
votes
10 answers

Django: add image in an ImageField from image url

please excuse me for my ugly english ;-) Imagine this very simple model : class Photo(models.Model): image = models.ImageField('Label', upload_to='path/') I would like to create a Photo from an image URL (i.e., not by hand in the django admin…
user166648
96
votes
4 answers

How do I catch a specific HTTP error in Python?

I have import urllib2 try: urllib2.urlopen("some url") except urllib2.HTTPError: but what I end up doing is catching any kind of HTTP error. I want to catch it only if the specified webpage doesn't exist (404?).
Arnab Sen Gupta
  • 5,639
  • 5
  • 24
  • 17
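
A minimal sketch of one way to handle only a 404, written for Python 3 (urllib.error.HTTPError instead of urllib2.HTTPError): inspect the exception's code attribute and re-raise everything else. The function name and the opener parameter are assumptions added so the logic can be exercised without a network connection.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def read_unless_missing(url, opener=urlopen):
    """Return the response body, or None if the server answers 404."""
    try:
        with opener(url) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 404:
            return None
        raise  # any other HTTP status propagates to the caller
```

The same pattern should carry over to Python 2 by importing HTTPError and urlopen from urllib2.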
92
votes
3 answers

Python, opposite function urllib.urlencode

How can I convert data after processing urllib.urlencode to dict? urllib.urldecode does not exist.
Artyom
  • 2,863
  • 3
  • 20
  • 15
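
For reference, a short sketch of the reverse operation in Python 3, using parse_qs and parse_qsl from urllib.parse (in Python 2 these live in the urlparse module). The sample keys and values are invented for the example.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode

encoded = urlencode({"city": "New York", "lang": "python"})

# parse_qs maps each key to a list, because a key may repeat in a query string.
as_lists = parse_qs(encoded)

# dict(parse_qsl(...)) gives a flat dict; a repeated key keeps its last value.
as_dict = dict(parse_qsl(encoded))
```

Which form is preferable depends on whether repeated keys can occur in the data.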
92
votes
12 answers

How to extract a filename from a URL and append a word to it?

I have the following URL: url = http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg I would like to extract the file name in this URL: 09-09-201315-47-571378756077.jpg Once I get this file name, I'm going to save it with this name to…
deadlock
  • 7,048
  • 14
  • 67
  • 115
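
One way to approach the question above in Python 3 is to split the URL first, so that any query string is ignored, and then take the last path component. The "_small" suffix is a hypothetical stand-in for the word to append.

```python
import posixpath
from urllib.parse import urlsplit

url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"

# URL paths always use forward slashes, so posixpath is used rather than os.path.
filename = posixpath.basename(urlsplit(url).path)

# Insert the extra word between the stem and the extension.
stem, ext = posixpath.splitext(filename)
renamed = stem + "_small" + ext  # "_small" is a placeholder suffix
```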
88
votes
4 answers

Changing User Agent in Python 3 for urllib.request.urlopen

I want to open a url using urllib.request.urlopen('someurl'): with urllib.request.urlopen('someurl') as url: b = url.read() I keep getting the following error: urllib.error.HTTPError: HTTP Error 403: Forbidden I understand the error to be due to…
user3662991
  • 1,083
  • 1
  • 11
  • 11
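
A sketch of how a custom User-Agent can be attached in Python 3: build a urllib.request.Request with a headers dict and pass that object to urlopen instead of the bare URL. The helper name, the URL, and the User-Agent string are placeholders, and whether a different User-Agent avoids a 403 depends entirely on the server.

```python
from urllib.request import Request


def build_request(url, user_agent="Mozilla/5.0 (compatible; example-client)"):
    """Return a Request carrying a custom User-Agent header (hypothetical helper)."""
    return Request(url, headers={"User-Agent": user_agent})


request = build_request("http://example.com/")  # placeholder URL

# To send it (requires network access):
#     from urllib.request import urlopen
#     with urlopen(request) as response:
#         body = response.read()
```

Request normalizes header names with str.capitalize(), so the stored key is "User-agent".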
83
votes
4 answers

Python3 error: initial_value must be str or None, with StringIO

While porting code from python2 to 3, I get this error when reading from a URL TypeError: initial_value must be str or None, not bytes. import urllib import json import gzip from urllib.parse import urlencode from urllib.request import…
AMisra
  • 1,869
  • 2
  • 25
  • 45
75
votes
5 answers

should I call close() after urllib.urlopen()?

I'm new to Python and reading someone else's code: should urllib.urlopen() be followed by urllib.close()? Otherwise, one would leak connections, correct?
Nikita
  • 6,019
  • 8
  • 45
  • 54
73
votes
12 answers

no module named urllib.parse (How should I install it?)

I'm trying to run a REST API on CentOS 7, I read urllib.parse is in Python 3 but I'm using Python 2.7.5 so I don't know how to install this module. I installed all the requirements but still can't run the project. When I'm looking for a URL I get…
javiercruzweb
  • 855
  • 1
  • 6
  • 7
72
votes
3 answers

Overriding urllib2.HTTPError or urllib.error.HTTPError and reading response HTML anyway

I receive a 'HTTP Error 500: Internal Server Error' response, but I still want to read the data inside the error HTML. With Python 2.6, I normally fetch a page using: import urllib2 url = "http://google.com" data = urllib2.urlopen(url) data =…
backus
  • 4,076
  • 5
  • 28
  • 30