In the class foo
in foomodule.py
below, I am getting an error in the run_with_multiprocessing
method. The method breaks up the number of records in self._data
into chunks and calls somefunc()
using a subset of the data, for example somefunc(data[0:800], 800)
in the first iteration, if limit = 800
.
I have done this, because running 10 * 1k records vs. 1 * 10k records shows a great performance improvement in a variation of the run_with_multiprocessing
function that does the same thing, just without multiprocessing. Now I want to use multiprocessing
to see if I can improve performance even more.
I am running python 3.8.2 on Windows 8.1. I am fairly new to python and multiprocessing. Thank you so much for your help.
# foomodule.py
import multiprocessing
class foo:
def __init__(self, data, record_count):
self._data = data
self._record_count = record_count
def some_func(self, data, record_count):
# looping through self._data and doing some work
def run_with_multiprocessing(self, limit):
step = 0
while step < self._record_count:
if self._record_count - step < limit:
proc = multiprocessing.Process(target=self.some_func, args=(self._data[step:self._record_count], self._record_count-step))
proc.start()
proc.join()
step = self._record_count
break
proc = multiprocessing.Process(target=self.some_func, args=(self._data[step:self._record_count], self._record_count-step))
proc.start()
proc.join()
step += limit
return
When using the class in script.py
, I get the following error:
import foomodule
# data is a mysql result set with, say, 10'000 rows
start = time.time()
bar = foomodule.foo(data, 10000)
limit = 800
bar.run_with_multiprocessing(limit)
end = time.time()
print("finished after " + str(round(end-start, 2)) + "s")
Traceback (most recent call last):
File "C:/coding/python/project/script.py", line 29, in <module>
bar.run_with_multiprocessing(limit)
File "C:\coding\python\project\foomodule.py", line 303, in run_with_multiprocessing
proc.start()
File "C:\...\Python\Python38-32\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\...\Python\Python38-32\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\...\Python\Python38-32\lib\multiprocessing\context.py", line 326, in _Popen
return Popen(process_obj)
File "C:\...\Python\Python38-32\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "C:\...\Python\Python38-32\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
File "C:\...\Python\Python38-32\lib\socket.py", line 272, in __getstate__
raise TypeError(f"cannot pickle {self.__class__.__name__!r} object")
TypeError: cannot pickle 'SSLSocket' object