
I am having moderate success using pathos and multiprocessing. However, pickling is an issue: as soon as a class contains anything from tkinter (my current GUI), I have to be extremely careful about which instance attributes I use for multiprocessing. When I use ProcessingPool in a class that also holds tkinter instances, it seems to pull in irrelevant data to pickle. This makes it (seemingly) unnecessarily tedious to do multiprocessing on anything that has something to do with tkinter. Is there a good reason for this?

More specifically, the following piece of code gives the desired result:

import tkinter as tk
from pathos.multiprocessing import ProcessingPool

class testpathos():
    def __init__(self):
        self.GUI = tk.Tk()
        self.testlist = [1,2,3,4,5]

    def testprocesspool(self):
        print(ProcessingPool().map(lambda x: squarenumber(x), self.testlist))

# Module-level helper; it must not be indented under the class.
def squarenumber(x):
    return x**2


testclass = testpathos()
testclass.testprocesspool()

which yields [1, 4, 9, 16, 25] as expected, with no errors.

However, the following - slightly extended - code

import tkinter as tk
from pathos.multiprocessing import ProcessingPool

class testpathos():
    def __init__(self):
        self.GUI = tk.Tk()
        self.testlist = [1,2,3,4,5]
        self.powerlist = [2,3,4,5]

    def testprocesspool(self):
        print(ProcessingPool().map(lambda x: powernumber(x,self.powerlist),self.testlist))

def powernumber(x,powerlist):
    return [x**i for i in powerlist]

testclass = testpathos()
testclass.testprocesspool()

raises

TypeError: can't pickle _tkinter.tkapp objects

The only difference is that I am now passing an instance attribute (a list) to the function given to ProcessingPool().map, and that list happens to be defined in the same class as some tkinter objects. If I remove the

self.GUI = tk.Tk()

line, which is irrelevant to the multiprocessing, I get [[1, 1, 1, 1], [4, 8, 16, 32], [9, 27, 81, 243], [16, 64, 256, 1024], [25, 125, 625, 3125]] as expected.

Workarounds that make it possible to use instance attributes directly even though they live alongside tkinter objects, as well as explanations of why ProcessingPool works this way, are most welcome.

Tarje Bargheer

1 Answer


This makes it (seemingly) unnecessarily tedious to do multiprocessing on anything that has something to do with tkinter. Is there a good reason for this?

The reason for the error TypeError: can't pickle _tkinter.tkapp objects is that Tkinter is a fairly thin wrapper around an embedded Tcl interpreter. That interpreter cannot be shared between processes; it is tied to the process that created it. Because of this, you can't pickle tkinter objects: they need the underlying Tcl interpreter and its internal state in order to function.
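As for why only the second snippet in the question triggers the error, one likely explanation is that its lambda refers to self.powerlist and therefore closes over self. Serializing that lambda means serializing the whole instance, including self.GUI. The first snippet's lambda has no free variables, and self.testlist is evaluated before map is called and sent as plain data. The sketch below shows the capture with the standard library only. The names Holder, gui and captured_names are hypothetical stand-ins, and no tkinter or pathos is involved.

```python
class Holder:
    """Stand-in for the class in the question (hypothetical name)."""

    def __init__(self):
        self.gui = object()            # placeholder for the tk.Tk() attribute
        self.powerlist = [2, 3, 4, 5]

    def closure_over_self(self):
        # Mirrors the failing snippet: the lambda body mentions `self`,
        # so the function object keeps a reference to the whole instance.
        return lambda x: [x**i for i in self.powerlist]

    def closure_over_list(self):
        # Bind the list to a local name first; the lambda then
        # captures only the list, not the instance.
        powerlist = self.powerlist
        return lambda x: [x**i for i in powerlist]


def captured_names(fn):
    """Return the names of the variables a function closes over."""
    return fn.__code__.co_freevars


holder = Holder()
over_self = holder.closure_over_self()
over_list = holder.closure_over_list()

print(captured_names(over_self))   # ('self',)
print(captured_names(over_list))   # ('powerlist',)
print(over_self.__closure__[0].cell_contents is holder)   # True
```

If this reading is right, copying self.powerlist into a local variable inside testprocesspool before building the lambda (or passing the list through functools.partial) should keep self, and with it self.GUI, out of what gets pickled.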

Bryan Oakley
  • Thanks. I have come to accept that I can't pickle tkinter objects. My question is more about why, in the first snippet, I send self.testlist through ProcessingPool().map and get no tkinter pickling errors, while in the second snippet the only extra thing I send through ProcessingPool() is self.powerlist, which has nothing to do with tkinter, and yet I get the pickling error. Is there a way to tell ProcessingPool that self.GUI shouldn't be sent along, or a good reason why this shouldn't be possible? – Tarje Bargheer Oct 17 '18 at 05:20
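One general way to keep an attribute out of the pickled state is the standard `__getstate__` / `__setstate__` protocol. The sketch below uses the standard library's pickle with a threading.Lock as a stand-in for the unpicklable tk.Tk() object; the class name Worker and the attribute name gui are hypothetical. pathos serializes with dill, which extends pickle, but whether this alone removes the error in the pathos case is an assumption that has not been verified here.

```python
import pickle
import threading


class Worker:
    """Hypothetical class holding one unpicklable attribute."""

    def __init__(self):
        self.gui = threading.Lock()    # stand-in for tk.Tk(); also unpicklable
        self.powerlist = [2, 3, 4, 5]

    def __getstate__(self):
        # Copy the instance dict and drop the attribute that cannot be pickled.
        state = self.__dict__.copy()
        state.pop("gui", None)
        return state

    def __setstate__(self, state):
        # Restore the picklable attributes; the copy gets no GUI handle.
        self.__dict__.update(state)
        self.gui = None


worker = Worker()
clone = pickle.loads(pickle.dumps(worker))

print(clone.powerlist)   # [2, 3, 4, 5]
print(clone.gui)         # None
```

The trade-off is that every copy arriving in a worker process has no GUI handle, so any method run there must not touch self.gui.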