
Using Python 2.7, I have this code:

import urllib2
import time
from multiprocessing import Pool

WORKER_COUNT = 10

def worker(url):
    print url
    try:
        response = urllib2.urlopen(url)
        html = response.read()
    except Exception:
        pass  # ignore fetch errors; this is throwaway test code

if __name__ == "__main__":
    urls = [
        'http://localhost:90/',
        'http://localhost:90/Default.aspx',
        'http://localhost:91/?a=2&m=0',
        'http://localhost:91/?a=2&m=1',
        'http://localhost:91/?a=2&m=2',
        'http://localhost:91/?s=2',
        'http://localhost:91/?a=2&ft=false',
        'http://localhost:91/?a=2&f=false',
        'http://localhost:91/?fail=1',
        'http://localhost:91/?fail=query',
        'http://localhost:92/?a=2&m=0',
        'http://localhost:92/?a=2&m=1',
        'http://localhost:92/?a=2&m=2',
        'http://localhost:92/?s=2'
        ]
    while True:
        p = Pool(WORKER_COUNT)
        p.map(worker, urls[0:4])  # too many URLs cause it to freeze up
        print "Restart"
        time.sleep(5)

When I use all the URLs (there might be around 30), it freezes up after running through the first set. When I use only the first four URLs, as in the code above (urls[0:4]), it does not freeze up. Any ideas why? This is test code, so it is supposed to run forever (for hours).

  • You shouldn't be recreating the `Pool` object inside the while loop. Create it outside. – dano Jul 03 '14 at 22:11
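A minimal sketch of dano's suggestion, reusing `worker`, `urls`, and `WORKER_COUNT` from the question: the `Pool` is created once, so its worker processes are reused across iterations. The `try`/`finally` shutdown is an illustrative addition, not part of the comment.

if __name__ == "__main__":
    p = Pool(WORKER_COUNT)  # create the pool once; its workers are reused
    try:
        while True:
            p.map(worker, urls)  # blocks until every URL has been fetched
            print "Restart"
            time.sleep(5)
    finally:
        p.close()  # stop accepting new work
        p.join()   # wait for the worker processes to exit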

1 Answer


Oops, this problem was not related to the Pool. It's some of the URLs that freeze up. I fixed it by adding a timeout:

import socket

try:
    # give up on a URL if the server does not respond within 10 seconds
    response = urllib2.urlopen(url, timeout=10)
    html = response.read()
except socket.timeout:
    print "timeout"
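One caveat, offered as a sketch rather than as part of the original answer: in Python 2.7, a timeout that fires while `urlopen` is still connecting is wrapped in a `urllib2.URLError`, while a timeout during `read()` raises `socket.timeout` directly, so catching both covers more cases.

import socket
import urllib2

try:
    response = urllib2.urlopen(url, timeout=10)
    html = response.read()
except socket.timeout:
    # read() timed out after the connection was established
    print "timeout"
except urllib2.URLError as ex:
    # urlopen() wraps connect-phase timeouts in a URLError
    if isinstance(ex.reason, socket.timeout):
        print "timeout"
    else:
        raise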

Also be aware that using too many workers can cause the application to freeze up through resource exhaustion.
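One simple way to keep the pool bounded, as a sketch (the `cpu_count()`-based cap and the factor of 4 are illustrative choices, not values from the answer):

import multiprocessing

# Cap the pool size; the factor of 4 is an arbitrary choice for
# I/O-bound fetching, not a tuned value.
WORKER_COUNT = min(len(urls), multiprocessing.cpu_count() * 4)
p = multiprocessing.Pool(WORKER_COUNT)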
