Use mechanize with Python

Question

I'm trying to open a URL with mechanize, but I don't want to just open it and close it right away. I want to open the URL, wait 7 minutes, and then close it.

What I'm trying to do:

import mechanize
import cookielib
import time


url='http://google.com/'
op = mechanize.Browser()

cj = cookielib.LWPCookieJar()
op.set_handle_robots(False)
op.set_handle_equiv(True)
op.set_handle_referer(True)
op.set_handle_redirect(True)
op.set_cookiejar(cj)
op.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=7)

op.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

op.open(url)
time.sleep(7)

print op.geturl()

But it didn't work. How can I do it?

Thanks.

I don't think I understand the question, then. Sleep for 7 minutes by using `time.sleep(420)`. What's not working if not that? In what way does it not work specifically with Mechanize? — a p, Apr 01 '15 at 23:03
@ap I'm trying to open a URL, keep it open for 7 minutes, and then close it. That's what I need. Thank you — deounix, Apr 01 '15 at 23:05
@deounix can you share the motivation behind such an odd requirement? Thanks. — alecxe, Apr 01 '15 at 23:15
@alecxe Some sites require you to wait 7 seconds or 1 minute before they show the URL to open. — deounix, Apr 01 '15 at 23:43

geoelectric · Accepted Answer · 2015-04-01T23:30:04.673

mechanize is a tool for performing HTTP requests and handling the responses, with a little more ability to act like a browser than things like urllib.

HTTP is (for the most part) stateless: you don't hold a web page open in the sense you seem to be thinking. The connection was already closed by the time `open` returned.

You are retrieving Google's homepage, getting an object back from mechanize representing that response, waiting 7 seconds, and then asking for the url attached to the response.

I did run your code, and to that extent it works.
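Since nothing stays open between requests, any waiting happens purely on the client side, between two separate requests. Here is a minimal sketch of that pattern, written for Python 3 with only the standard library (the question's code is Python 2). The helper name `open_wait_open` and its parameters are made up for illustration; `open_url` can be any callable that takes a URL and returns a response.

```python
import time


def open_wait_open(open_url, first_url, second_url, delay_seconds, sleep=time.sleep):
    """Fetch first_url, pause on the client side, then fetch second_url.

    open_url is any callable that takes a URL and returns a response
    (hypothetical helper, not part of mechanize).
    """
    first_response = open_url(first_url)    # this request is finished once it returns
    sleep(delay_seconds)                    # nothing is held open during the pause
    second_response = open_url(second_url)  # a new, independent request
    return first_response, second_response
```

With the question's browser object, a call along the lines of `open_wait_open(op.open, url, next_url, 420)` would wait 7 minutes (420 seconds) between the two requests, where `next_url` stands for whatever link the site reveals after the delay.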

`set_handle_refresh` and `HTTPRefreshProcessor` come into play when a web page has a "refresh" meta tag that causes it to reload after a certain amount of time. I believe the `max_time` parameter you gave (7 seconds, again, not minutes) is the longest refresh delay mechanize will honor.

In any case, I don't think Google's homepage refreshes, so this has no effect there.
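To make the "refresh" meta concrete, here is a rough sketch of what such a tag looks like and how its delay and target could be read out by hand. It uses Python 3's standard `html.parser` rather than mechanize, and the names `MetaRefreshParser` and `find_meta_refresh` are invented for this example. It handles only the common `seconds; url=...` form and is not a description of how mechanize parses the tag internally.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Records the first <meta http-equiv="refresh"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.refresh = None  # becomes (delay_seconds, target_url_or_None)

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.refresh is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "7; url=http://example.com/next"
        delay, _, rest = (attrs.get("content") or "").partition(";")
        try:
            seconds = float(delay.strip())
        except ValueError:
            return  # malformed delay: ignore the tag
        rest = rest.strip()
        url = None
        if rest.lower().startswith("url="):
            url = rest[4:].strip().strip("'\"") or None
        self.refresh = (seconds, url)


def find_meta_refresh(html):
    """Return (delay_seconds, url) for the first meta refresh, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.refresh
```

A page that says "wait 7 seconds, then go here" this way would yield a delay of 7 and the target URL; a page without the tag yields `None`, in which case a refresh handler has nothing to act on.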

You can look into HTTP Keep-Alive/Persistent connections to see if there's something to do what you want, though even keep-alive connections aren't really pages being held open from the client's POV.
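As an illustration of what a persistent connection does and does not give you, the following sketch starts a throwaway local server and sends two requests over one `http.client` connection. It is Python 3, standard library only; `HelloHandler` and `two_requests_one_connection` are names invented for this example, and it assumes the machine allows binding a socket on 127.0.0.1.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection alive

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def two_requests_one_connection():
    """Send two GETs over one connection; report bodies and socket reuse."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        bodies, sockets = [], []
        for _ in range(2):
            conn.request("GET", "/")
            response = conn.getresponse()
            bodies.append(response.read())  # each response is complete here
            sockets.append(conn.sock)       # socket used for this exchange
        conn.close()
        return bodies, sockets[0] is sockets[1]
    finally:
        server.shutdown()
        server.server_close()
```

Both requests travel over the same socket, but each one is still a separate request and response. The reused connection saves setup cost; it does not keep a page "open".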
