
I'm using python-mechanize to scrape some web sites, which sometimes simply don't respond to requests; those requests then stay open for too long, so I need to limit the timeout for them.

When using the urlopen method, the timeout can be set via the timeout parameter, but I have not found an easy way to do this with the high-level API, such as the submit or click methods. Ideally the timeout would be set just once for the whole Browser class and all calls would honor it.

It would probably be possible to customize this by passing a custom request_class to every click and submit call, but that would just pollute the code, so I'm looking for a nicer way to set a timeout on mechanize's Browser class (and no, I don't want to change the default socket timeout using socket.setdefaulttimeout).
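
For concreteness, here is a minimal sketch of the situation (the URL is a placeholder, and I'm assuming Browser.open takes the same timeout keyword as urlopen):

import mechanize

br = mechanize.Browser()

# A per-request timeout is easy with the low-level open call:
br.open("http://example.com/form", timeout=10.0)

# ...but the high-level helpers expose no obvious timeout parameter,
# so a hung server keeps these calls waiting indefinitely:
br.select_form(nr=0)
response = br.submit()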

Michal Čihař

2 Answers


It is slightly frowned upon within the Python community, but you can "duck punch" the desired behaviour into the browser class.

Basically, you need to do the following: create a function that does what you want (using a custom request class) and patch it onto the Browser class.

from mechanize import Browser

# Monkey-patch Browser.click so every call uses the custom request class.
browser_click = Browser.click
def my_click(self, *args, **kwds):
    return browser_click(self, *args, request_class=MyRequestClass, **kwds)
Browser.click = my_click
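
Here MyRequestClass could, for example, be a small mechanize.Request subclass that forces a fixed timeout. This is only a sketch; that mechanize.Request accepts a timeout keyword is an assumption you should check against your mechanize version:

import mechanize

class MyRequestClass(mechanize.Request):
    """Request class that applies a fixed timeout to every request."""
    def __init__(self, *args, **kwds):
        # Assumption: mechanize.Request accepts a timeout keyword argument.
        kwds.setdefault("timeout", 10.0)
        mechanize.Request.__init__(self, *args, **kwds)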

If that is too Ruby for your taste, you can create a subclass of Browser that does something similar.

class MyBrowser(Browser):
    def click(self, *args, **kwds):
        # Delegate to the base class, forcing the custom request class.
        return Browser.click(self, *args, request_class=MyRequestClass, **kwds)

I find this a bit cleaner, but it will not work if you have no control over the creation of your Browser instances.
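
Usage then looks the same as with the stock Browser (a sketch; the URL is a placeholder):

br = MyBrowser()
br.open("http://example.com/form")
br.select_form(nr=0)
response = br.submit()  # click() now builds its requests with MyRequestClass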

Hans Then

You could try using a loop that checks the elapsed time, with code such as:

import time

start = time.clock()  # note: time.clock() is deprecated in Python 3
# ... do something
elapsed = time.clock() - start

or

start = time.time()
# ... do something
elapsed = time.time() - start
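
A sketch of how that might look around the slow mechanize call (browser, the selected form, and the 30-second limit are placeholders):

import time

start = time.time()
response = browser.submit()      # the potentially slow high-level call
elapsed = time.time() - start

if elapsed > 30.0:               # placeholder limit in seconds
    # React to the slow site, e.g. skip it on the next scraping pass.
    print("request took %.1f seconds" % elapsed)
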
Tim.DeVries