I use the pathos ProcessingPool class to schedule concurrent execution of the run_regex() function across multiple cores. The function takes a regular expression as an argument and evaluates list entries for a match. If a match is found, it puts the matching value into result_queue.

As I understand it, each worker process currently gets its own copy of result_queue in its own virtual address space. I'd like the Queue object to be shared instead, so that the main process can read all matches from it.

Questions:

  1. Is there a way to pass a Queue object into the Pool initializer, so the queue acts as a shared memory section?
  2. Is synchronization required with Queue objects?
  3. Is there a better way to approach this problem?

Code Snippet

    import re
    from multiprocessing import Lock, Queue
    from pathos.multiprocessing import ProcessingPool

    result_queue = Queue()
    lock = Lock()
    data = {}

    def run_regex(expr):
        # Queue the first key that matches expr (case-insensitive), then stop.
        for key, value in data.items():
            matchStr = re.search(expr, key, re.I)
            if matchStr:
                lock.acquire()
                result_queue.put(key)
                lock.release()
                break

    def check_path():
        # in_regex (the list of patterns) is defined elsewhere.
        pool = ProcessingPool()
        pool.map(run_regex, in_regex)
zan

1 Answer

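  1. Yes. Take a look at the initializer and initargs parameters of multiprocessing.Pool: the initializer runs once in each worker process, so it can store the queue in a global there.
  2. No. Queue objects are already process-safe, so there's no need to protect them with a Lock.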
  3. You don't need a Queue to return values from the run_regex function. You can simply return the key from the function and it will be made available by the map result.

    import itertools
    import re

    def run_regex(expr):
        # Collect every key that matches expr (case-insensitive).
        group = []
    
        for key, value in data.items():
            match = re.search(expr, key, re.I)
            if match is not None:
                group.append(key)
    
        return group
    
    groups = pool.map(run_regex, in_regex)
    keys = [key for group in groups for key in group]
    

    or

    keys = list(itertools.chain.from_iterable(groups))
    

    pool.map returns one list of keys per regex, in the same order as in_regex. You can easily flatten that list of lists afterwards, as shown above.
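
    If you still want the shared queue from points 1 and 2, the following is a minimal sketch of the initializer approach. It assumes the standard-library multiprocessing.Pool rather than pathos' ProcessingPool, and the names init_worker, find_matches and the sample data are made up for illustration. Each task reports whether it queued a key, so the parent knows exactly how many items to read instead of guessing when the queue is empty.

```python
import multiprocessing as mp
import re

# Per-process globals, filled in by init_worker() when each worker starts.
_result_queue = None
_data = None


def init_worker(queue, data):
    """Runs once in every worker process; stores the shared queue and data."""
    global _result_queue, _data
    _result_queue = queue
    _data = data


def run_regex(expr):
    """Put the first key matching expr on the shared queue.

    Returns True if a key was queued, so the parent knows how many to read.
    """
    for key in _data:
        if re.search(expr, key, re.I):
            _result_queue.put(key)  # the Queue is process-safe, no Lock needed
            return True
    return False


def find_matches(patterns, data, processes=2):
    """Collect the first matching key per pattern through a shared queue."""
    queue = mp.Queue()
    with mp.Pool(processes, initializer=init_worker,
                 initargs=(queue, data)) as pool:
        queued = sum(pool.map(run_regex, patterns))
        # Read exactly as many items as were queued; get() blocks until
        # each one arrives, and the workers are still alive at this point.
        return sorted(queue.get() for _ in range(queued))


if __name__ == "__main__":
    # Hypothetical sample data, only for demonstration.
    sample = {"/usr/bin/python": 1, "/etc/hosts": 2, "/var/log/syslog": 3}
    print(find_matches([r"PYTHON$", r"^/etc", r"nomatch"], sample))
```

    The queue is passed through initargs because a multiprocessing.Queue can only be handed to workers when they are created, not as an argument to map. For this particular problem, returning the keys from run_regex is still simpler.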

noxdafox