Questions tagged [urllib2]

urllib2 is a built-in Python 2 module that defines functions and classes to help with URL actions. It is widely considered awkward to use; in Python 3 it was replaced by urllib.request and urllib.error, and third-party libraries are a common alternative.

urllib2 is a Python 2 module that defines functions and classes to help with URL actions (basic and digest authentication, redirections, cookies, etc.). It supersedes most of urllib in Python 2, and in Python 3 its functionality was merged into the reorganized urllib package (urllib.request and urllib.error).

In addition, the third-party requests module has become a de facto standard for the same tasks.

2960 questions
39
votes
4 answers

How to retry urllib2.request when fails?

When urllib2.request reaches timeout, a urllib2.URLError exception is raised. What is the pythonic way to retry establishing a connection?
iTayb
  • 12,373
  • 24
  • 81
  • 135
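For the retry question above, one possible approach is a small loop that catches URLError and tries again. This is only a sketch: it is written for Python 3, where urllib2's pieces live in urllib.request and urllib.error, and the names `open_with_retry`, `attempts`, `delay` and `opener` are hypothetical, not part of any library.

```python
import time
import urllib.error
import urllib.request


def open_with_retry(url, attempts=3, delay=1.0, timeout=10,
                    opener=urllib.request.urlopen):
    """Call opener(url, timeout=...) and retry when URLError is raised.

    `opener` is injectable (an assumption of this sketch) so the retry
    logic can be exercised without touching the network.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return opener(url, timeout=timeout)
        except urllib.error.URLError as exc:
            last_error = exc
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed: re-raise the most recent error.
    raise last_error
```

Under Python 2 the same loop should work with `urllib2.urlopen` and `urllib2.URLError`. Timeouts that occur while reading the body may surface as a socket timeout rather than URLError, so a real script may want to catch that as well.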
39
votes
1 answer

Python URLLib / URLLib2 POST

I'm trying to create a super-simplistic Virtual In / Out Board using wx/Python. I've got the following code in place for one of my requests to the server where I'll be storing the data: data = urllib.urlencode({'q': 'Status'}) u =…
g.d.d.c
  • 46,865
  • 9
  • 101
  • 111
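As a rough illustration of the POST question above: passing URL-encoded bytes as `data` turns the request into a POST. The sketch uses the Python 3 names (urllib.parse, urllib.request); `build_post_request` and the example URL are made up for illustration.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Return a Request whose body is the URL-encoded form of `fields`."""
    # urlencode gives a str; Python 3 requires the request body as bytes.
    data = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying `data` makes urllib send POST instead of GET.
    return urllib.request.Request(url, data=data)


request = build_post_request("http://example.invalid/status", {"q": "Status"})
print(request.get_method())
```

In Python 2 the equivalent pieces are `urllib.urlencode` and `urllib2.Request`; the request is actually sent only when it is passed to `urlopen`.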
38
votes
1 answer

Login to website using urllib2 - Python 2.7

Okay, so I am using this for a reddit bot, but I want to be able to figure out HOW to log in to any website. If that makes sense.... I realise that different websites use different login forms etc. So how do I figure out how to optimise it for each…
tommo
  • 599
  • 2
  • 9
  • 14
37
votes
3 answers

Parse XML from URL into python object

The goodreads website has this API for accessing a user's 'shelves:' https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread It returns XML. I'm trying to create a django project that shows books on a shelf…
smilebomb
  • 5,123
  • 8
  • 49
  • 81
37
votes
3 answers

Get json data via url and use in python (simplejson)

I imagine this must have a simple answer, but I am struggling: I want to take a url (which outputs json) and get the data in a usable dictionary in python. I am stuck on the last step. >>> import urllib2 >>> import simplejson >>> req =…
thornomad
  • 6,707
  • 10
  • 53
  • 78
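For the JSON question above, the last step is usually to read the response body and hand it to a JSON parser. A minimal sketch, assuming a UTF-8 body and using the standard-library json module (whose `loads` has the same shape as simplejson's); `load_json` is a hypothetical helper name.

```python
import io
import json


def load_json(response, encoding="utf-8"):
    """Read a file-like HTTP response and decode its body as JSON."""
    raw = response.read()              # bytes from the socket (or any file-like object)
    return json.loads(raw.decode(encoding))


# A BytesIO stands in for the object returned by urlopen().
fake_response = io.BytesIO(b'{"title": "example", "count": 2}')
data = load_json(fake_response)
print(data["count"])
```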
35
votes
3 answers

How to get the URL of a redirect with Python

In Python, I'm using urllib2 to open a url. This url redirects to another url, which redirects to yet another url. I wish to print out the url after each redirect. For example (-> = redirects to): A -> B -> C -> D. I want to print the URL of B, C and D…
Matthew H
  • 5,831
  • 8
  • 47
  • 82
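One way to see each hop in the redirect question above is to subclass the redirect handler and record every target before following it. This is a sketch for Python 3 (in Python 2 the base class is `urllib2.HTTPRedirectHandler`); `RecordingRedirectHandler`, `seen` and `open_and_trace` are hypothetical names.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that remembers every URL it is redirected to."""

    def __init__(self):
        self.seen = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Record the target, then let the default logic build the new request.
        self.seen.append(newurl)
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def open_and_trace(url, timeout=10):
    """Open `url` and return (response, list of redirect targets in order)."""
    handler = RecordingRedirectHandler()
    opener = urllib.request.build_opener(handler)
    response = opener.open(url, timeout=timeout)
    return response, handler.seen
```

After the call, `response.geturl()` should give the final URL, while the returned list holds the intermediate ones.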
35
votes
2 answers

How to correctly parse UTF-8 encoded HTML to Unicode strings with BeautifulSoup?

I'm running a Python program which fetches a UTF-8-encoded web page, and I extract some text from the HTML using BeautifulSoup. However, when I write this text to a file (or print it on the console), it gets written in an unexpected encoding. Sample…
Christopher Orr
  • 110,418
  • 27
  • 198
  • 193
35
votes
7 answers

How to download any(!) webpage with correct charset in python?

Problem: When screen-scraping a webpage using Python, one has to know the character encoding of the page. If you get the character encoding wrong, then your output will be messed up. People usually use some rudimentary technique to detect the encoding.…
Tarnay Kálmán
  • 6,907
  • 5
  • 46
  • 57
34
votes
2 answers

How do I add a header to urllib2 opener?

cj = cookielib.CookieJar() opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj)) opener.open('http://abc.com') opener.open('http://google.com') As you can see, I use opener to visit different websites, using a cookie jar. Can I set a…
TIMEX
  • 259,804
  • 351
  • 777
  • 1,080
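For the header question above, an opener carries an `addheaders` list that is sent with every request it makes. A minimal sketch using the Python 3 module names (http.cookiejar and urllib.request in place of cookielib and urllib2); `build_cookie_opener` and the User-Agent value are illustrative assumptions.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener(extra_headers):
    """Return (opener, cookie jar) where the opener sends `extra_headers` every time."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # addheaders is a list of (name, value) pairs; assigning it replaces
    # the default list, which normally holds the Python-urllib User-Agent.
    opener.addheaders = list(extra_headers)
    return opener, jar


opener, jar = build_cookie_opener([("User-Agent", "example-bot/0.1")])
print(opener.addheaders)
```

Every later `opener.open(...)` call would then include these headers, with cookies shared through the same jar.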
33
votes
6 answers

Source interface with Python and urllib2

How do I set the source IP/interface with Python and urllib2?
Jonas Lejon
  • 3,189
  • 3
  • 28
  • 26
32
votes
2 answers

How do I download a zip file in python using urllib2?

Two-part question. I am trying to download multiple archived Cory Doctorow podcasts from the Internet Archive: the old ones that do not come into my iTunes feed. I have written the script, but the downloaded files are not properly formatted. Q1 -…
Justjoe
  • 371
  • 1
  • 4
  • 10
32
votes
2 answers

Python urllib2 URLError HTTP status code.

I want to grab the HTTP status code once it raises a URLError exception. I tried this, but it didn't help: except URLError, e: logger.warning( 'It seems like the server is down. Code:' + str(e.code) )
Hellnar
  • 62,315
  • 79
  • 204
  • 279
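A likely reason the snippet in the question above fails is that only HTTPError, a subclass of URLError, carries a `code` attribute; a plain URLError (for example a refused connection) has just a `reason`. A sketch with the Python 3 module name urllib.error; `describe_failure` is a hypothetical helper.

```python
import urllib.error


def describe_failure(exc):
    """Summarize a URLError, including the HTTP status when there is one."""
    # HTTPError must be checked first because it is a subclass of URLError.
    if isinstance(exc, urllib.error.HTTPError):
        return "HTTP %d" % exc.code
    # No response was received, so there is no status code to report.
    return "no HTTP status (%s)" % exc.reason
```

In a real script this would be called from an `except urllib.error.URLError as exc:` block (or `except urllib2.URLError, e:` in Python 2).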
31
votes
3 answers

Read file object as string in python

I'm using urllib2 to read in a page. I need to do a quick regex on the source and pull out a few variables but urllib2 presents as a file object rather than a string. I'm new to python so I'm struggling to see how I use a file object to do this. Is…
Oli
  • 235,628
  • 64
  • 220
  • 299
31
votes
14 answers

urllib2 file name

If I open a file using urllib2, like so: remotefile = urllib2.urlopen('http://example.com/somefile.zip') Is there an easy way to get the file name other than parsing the original URL? EDIT: changed openfile to urlopen... not sure how that…
defrex
  • 15,735
  • 7
  • 34
  • 45
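For the file name question above, servers sometimes announce a name in the Content-Disposition header; when they do not, the URL path is the usual fallback. The sketch below works on plain strings so it runs without a network; `guess_filename` is a hypothetical helper, and the header value would come from something like `remotefile.headers.get('Content-Disposition')`.

```python
import posixpath
from email.message import Message
from urllib.parse import urlsplit


def guess_filename(url, content_disposition=None):
    """Prefer the Content-Disposition filename, else the last URL path segment."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Keep only the final component in case the server sent a path.
            return posixpath.basename(name)
    return posixpath.basename(urlsplit(url).path)


print(guess_filename("http://example.com/somefile.zip"))
```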
30
votes
6 answers

A good way to get the charset/encoding of an HTTP response in Python

Looking for an easy way to get the charset/encoding information of an HTTP response using Python urllib2, or any other Python library. >>> url = 'http://some.url.value' >>> request = urllib2.Request(url) >>> conn = urllib2.urlopen(request) >>>…
Clay Wardell
  • 14,846
  • 13
  • 44
  • 65
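For the charset question above, the encoding is normally declared in the Content-Type response header. In Python 3 the response headers object offers `get_content_charset()`; the sketch below applies the same parser to a header string so it can run offline. `charset_from_content_type` is a hypothetical helper name.

```python
from email.message import Message


def charset_from_content_type(content_type, default=None):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # Returns the charset in lower case, or `default` when none is declared.
    return msg.get_content_charset(failobj=default)


print(charset_from_content_type("text/html; charset=UTF-8"))
```

When the header declares no charset, the page may still state one in a meta tag, so a fallback is usually needed.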