Questions tagged [urlopen]

urlopen is a function in Python's urllib library (urllib.request in Python 3; urllib and urllib2 in Python 2) that opens a URL. It returns a file-like object that exposes the response headers, the response data, and other details about the requested resource.
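
As a minimal sketch of the description above, assuming Python 3's urllib.request: the helper name fetch_text and the data: URL are illustrative choices, not part of the tag description. The data: URL only keeps the example offline; an http(s) URL is opened the same way.

```python
from urllib.request import urlopen

# Illustrative URL: a data: URL needs no network access.
EXAMPLE_URL = "data:text/plain;charset=utf-8,hello%20urlopen"


def fetch_text(url):
    """Open url and return (content_type, decoded_body)."""
    # The response is file-like: it supports read() and the with statement.
    with urlopen(url) as response:
        # response.headers behaves like a case-insensitive mapping.
        content_type = response.headers.get("Content-Type")
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        body = response.read().decode(charset)
    return content_type, body


content_type, body = fetch_text(EXAMPLE_URL)
print(content_type)
print(body)
```

Note that read() returns bytes in Python 3, so the body has to be decoded before it is treated as text.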

369 questions
3
votes
1 answer

python mechanize javascript submit button problem!

I'm writing a script with mechanize.Browser. Everything else works, but calling submit() on the form does not, so I looked for a suspicious part of the page. In the HTML source I found something like the following. I'm thinking,…
paul
  • 327
  • 1
  • 7
  • 24
3
votes
2 answers

Detecting timeout errors in Python's urllib2 urlopen

I'm still relatively new to Python, so if this is an obvious question, I apologize. My question is in regard to the urllib2 library and its urlopen function. Currently I'm using this to load a large number of pages from another server (they are…
Parker
  • 8,539
  • 10
  • 69
  • 98
3
votes
5 answers

Caching options in Python or speeding up urlopen

Hey all, I have a site that looks up info for the end user, is written in Python, and requires several urlopen commands. As a result it takes a bit for a page to load. I was wondering if there was a way to make it faster? Is there an easy Python way…
Jill S
  • 93
  • 1
  • 1
  • 3
3
votes
2 answers

How to Return to the First Line in a Urlopen Object

I am iterating over a .dat file saved on an HTTP website using import urllib2 test_file = urllib2.urlopen('http://~/file.dat') And then, I have a function which iterates over the file def f(file): while True: iter = file.readline() if iter…
Xuyan Xiao
  • 81
  • 5
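
One possible approach to the question above, sketched under the assumption of Python 3's urllib.request and a file small enough to hold in memory: a network response generally cannot be rewound, so the body is copied into an io.BytesIO buffer that supports seek(0). The helper name open_rewindable and the data: URL are hypothetical.

```python
import io
from urllib.request import urlopen

# Illustrative two-line "file"; %0A is an encoded newline.
EXAMPLE_URL = "data:text/plain,first%0Asecond%0A"


def open_rewindable(url):
    """Download url fully and return a seekable in-memory copy."""
    with urlopen(url) as response:
        return io.BytesIO(response.read())


buffered = open_rewindable(EXAMPLE_URL)
first_pass = buffered.readline()   # consume the first line
buffered.seek(0)                   # rewind to the start of the buffer
second_pass = buffered.readline()  # the first line again
```

For very large files, re-requesting the URL may be preferable to buffering the whole body.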
3
votes
1 answer

Python: urlopen - skip entry if any error occurs

I was wondering if there was some sort of "Catch all" code for urlopen that would skip an entire entry in my for loop should any error in accessing the website occur.
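
A hedged sketch of the kind of "catch all" loop the question describes, assuming Python 3: URLError, HTTPError and socket timeouts are all subclasses of OSError, a malformed URL raises ValueError, and protocol-level failures such as BadStatusLine derive from http.client.HTTPException. The function name fetch_all and the sample URLs are illustrative.

```python
from http.client import HTTPException
from urllib.request import urlopen


def fetch_all(urls, timeout=10):
    """Return {url: body} for every URL that could be read."""
    results = {}
    for url in urls:
        try:
            with urlopen(url, timeout=timeout) as response:
                results[url] = response.read()
        except (OSError, ValueError, HTTPException):
            # Skip this entry and move on to the next URL.
            continue
    return results


# data: URLs keep the demo offline; "not-a-url" is rejected with ValueError.
urls = ["data:text/plain,ok", "not-a-url", "data:text/plain,also%20ok"]
pages = fetch_all(urls)
```

Catching these specific families rather than a bare except keeps KeyboardInterrupt and programming errors visible.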
3
votes
1 answer

Preventing a "hidden" redirect with urlopen() in Python

I am using BeautifulSoup for web scraping and I am having problems with a particular type of website when using urlopen. Every item on the website has its own unique page and the item comes in different formats (ex: 500 mL, 1L, 2L,...). When I open…
LaGuille
  • 1,658
  • 5
  • 20
  • 37
3
votes
0 answers

getting http.client.BadStatusLine with urlopen(IP).read()

The data I am trying to read is in XML format. There is a single space before the XML declaration. I cannot edit this part as it is hard-coded into the data source. I can only read from it. When the URL is entered in IE the data comes up. When…
mad5245
  • 394
  • 3
  • 8
  • 20
3
votes
2 answers

urllib2.urlopen not working on Google Patents

I'm trying to scrape some data from google patents, and the beginning of my code looks like this: (here is the hyperlink to the url listed below) In [1]: import urllib2 In [2]:…
Chris
  • 9,603
  • 15
  • 46
  • 67
3
votes
1 answer

Error from urlopen "code for hash not found" on linux

I've tried a couple of searches and I don't think this has been asked, but if this is a duplicate please forgive me. I'm trying to use urllib on python-2.7 to read from a web page. Very simple application, all I want to do is get some text from a…
Magic_Matt_Man
  • 2,020
  • 3
  • 16
  • 16
2
votes
2 answers

How to request a URL with non-unicode characters in the main domain name (not params) in Python?

I cannot request the URL "http://www.besondere-raumdüfte.de" with urllib2.urlopen(). I tried encoding the string using urllib.urlencode with utf-8, idna, and ascii, but it still doesn't work. It raises URLError:
2
votes
1 answer

Python - character encoding and decoding problems

I have one source file with UTF-8 characters (names) and one output file with the same character encoding. I am working with an HTML page, cutting and pasting the information that is useful to me into the output file. I use "éáűúőóüöäđĐ" characters in my…
user1292883
  • 29
  • 1
  • 4
2
votes
1 answer

How do I set a header that prevents the site from sending a gzip-encoded response?

I am using Python's urllib2.urlopen to get HTML content, and I am getting a gzipped response. Can I set the headers so that the response is not compressed? My code: response = urlopen(url, None, TIMEOUT) html = response.read() # read html print html as…
yossi
  • 12,945
  • 28
  • 84
  • 110
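
A sketch of one way to approach the question above, assuming Python 3's urllib.request: the request asks for an uncompressed body with Accept-Encoding: identity, and because a server may ignore that header, the reader still checks Content-Encoding and decompresses when needed. The names build_request, read_body and fetch are hypothetical, and http://example.com/ is only a placeholder.

```python
import gzip
from urllib.request import Request, urlopen


def build_request(url):
    """Ask the server not to compress the response body."""
    return Request(url, headers={"Accept-Encoding": "identity"})


def read_body(response):
    """Read the body, decompressing it if the server sent gzip anyway."""
    raw = response.read()
    if response.headers.get("Content-Encoding", "").lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw


def fetch(url, timeout=10):
    """Combine both helpers; not called here, to avoid network access."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return read_body(response)


# Placeholder URL; building the request does not contact the server.
request = build_request("http://example.com/")
```

Request stores header names in capitalized form, so the header set here is retrieved with request.get_header("Accept-encoding").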
2
votes
1 answer

urlopen call with timeout not terminating after timeout

In Python 2.4.4, I'm using urllib2.urlopen() to request a resource. Before making the request, I'm setting a timeout with: socket.setdefaulttimeout(10) (This version of Python is too old to have a version of urlopen() with built-in timeout.) In…
jrdioko
  • 32,230
  • 28
  • 81
  • 120
2
votes
2 answers

Why does text retrieved from pages sometimes look like gibberish?

I'm using urllib and urllib2 in Python to open and read webpages but sometimes, the text I get is unreadable. For example, if I run this: import urllib text = urllib.urlopen('http://tagger.steve.museum/steve/object/141913').read() print text I get…
Siato
  • 355
  • 1
  • 5
  • 13
2
votes
3 answers

Gibberish from urlopen

I am trying to read some utf-8 files from the addresses in the code below. It works for most of them, but for some files the urllib2 (and urllib) is unable to read. The obvious answer here is that the second file is corrupt, but the strange thing…