I am using Python's multiprocessing module to parallelize the processing of a large raster dataset, and the processing itself seems to work fine. Once it is complete, I need to automatically delete all the files created by the parallel processes. However, this fails with the following error:

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'xxxxxxxx.tif'

This is the code I'm using:

def my_function_1(info):
    ...

def my_function_2(...):
    ...

def delete_func():
    ...

info = {}
info = .... #I populate a dict with the parameters I want to send to the function
pool = Pool(processes=16)
pool.map(my_function_1, info.items())
pool.close()
pool.join()  

my_function_2(...)    #Processing files created during the multiprocessing

delete_func()         #Delete files created during the multiprocessing

When delete_func() is called, it starts deleting all the files created during the multiprocessing (hundreds of them), but at some point it throws the error mentioned above. It seems that one or more processes are still holding on to one or more of the files. How can I make sure all processes are closed and all files are "free" to be deleted?

Pitrako Junior
  • There really is no way to help without a [mcve]. You haven't even provided the full error message... The issue is almost certainly in the details of the functions you don't provide (I suspect `my_function_2`). – juanpa.arrivillaga Feb 01 '22 at 19:08
  • Also, this should not work on Windows because you are not protecting your `pool.map` call with an `if __name__ == "__main__":` guard... this should create a multiprocessing bomb. – juanpa.arrivillaga Feb 01 '22 at 19:10
  • The whole code is about 600 lines. The only thing my_function_2 does is to merge all the tif files created during the multiprocessing into one unique large tif file. After the merge has taken place I don't need all the tiles of the mosaic any longer and just wish to delete them. What do you mean by "multiprocessing bomb"? – Pitrako Junior Feb 01 '22 at 19:18

1 Answer

You can achieve this by changing these lines:

pool = Pool(processes=16)
pool.map(my_function_1, info.items())
pool.close()
pool.join()  

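to this:

with Pool(16) as pool:
    pool.map(my_function_1, info.items())

With this form, pool.map blocks until every task has finished, and leaving the with block calls pool.terminate(), so the rest of the program runs only after all the processing has stopped.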
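As a rough sketch of how the pool could be wrapped so that it also runs safely on Windows, the example below puts the pool behind an `if __name__ == "__main__":` guard. The names `square` and `run_all` and the sample dict are hypothetical stand-ins, not part of the original code; `square` only plays the role of my_function_1.

```python
from multiprocessing import Pool


def square(item):
    # Hypothetical stand-in for my_function_1: receives one (key, value) pair.
    key, value = item
    return key, value * value


def run_all(info, processes=2):
    # pool.map blocks until every task has returned a result.
    # Leaving the with block then calls pool.terminate() on the workers.
    with Pool(processes) as pool:
        return dict(pool.map(square, info.items()))


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block when they
    # import the main module, which is what happens with the "spawn" start
    # method used on Windows.
    print(run_all({"a": 2, "b": 3}))
```

The worker function is defined at module level so that child processes can import it by name; only the code that creates the pool sits behind the guard.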

If you write to a file, use this code:

with open("path_to_your_file", "the_mode_you_want") as file:
    file.write("text")
    # or do whatever you want with the file

Using the with statement means the file is closed automatically when the block exits, so no handle is left open that could block a later delete.
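To illustrate the point about handles, here is a minimal sketch that writes a few files inside with blocks and then deletes them. It assumes plain text files in a temporary directory rather than real .tif tiles, and `write_tile`, `delete_tiles` and `demo` are hypothetical names used only for this example. The same idea should apply to whatever library opens the rasters, provided its dataset objects are closed before the delete step.

```python
import os
import tempfile


def write_tile(path, text):
    # The with block closes the file handle on exit, even if write() raises.
    with open(path, "w") as f:
        f.write(text)


def delete_tiles(paths):
    # Collect the files that could not be removed instead of stopping at the
    # first error; PermissionError is a subclass of OSError.
    failed = []
    for path in paths:
        try:
            os.remove(path)
        except OSError:
            failed.append(path)
    return failed


def demo():
    # Write three small files, delete them, and report what is left over.
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"tile_{i}.txt") for i in range(3)]
        for path in paths:
            write_tile(path, "data")
        failed = delete_tiles(paths)
        remaining = os.listdir(tmp)
    return failed, remaining


if __name__ == "__main__":
    print(demo())
```

Returning the list of failed paths, rather than raising on the first one, makes it easier to see exactly which files are still held open.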