
I am encountering an error that can probably be worked around, but it is conceptually confusing to me, and I am wondering if someone could shed some light on it.

When I recurse through a directory structure and call shutil.rmtree() under a certain condition, I get an error that amounts to an attempt to delete an object that has already been deleted. Here is the simplest example:

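import glob
import os
import shutil

def deleteFolder(path):
    # Recurse into every subdirectory first
    for obj in glob.glob(os.path.join(path, '*')):
        print('obj name is ',obj)
        if os.path.isdir(obj):
            deleteFolder(obj)
    # Then remove this directory itself (second argument is ignore_errors=False)
    print('removing path ',path)
    print(os.listdir(path))
    shutil.rmtree(path,False)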


WindowsError                              Traceback (most recent call last)
<ipython-input-36-ad9561d5fc92> in <module>()
      2 #path = 'C:\Users\Wes\Desktop\Test\Morphine\Album1'
      3 #shutil.rmtree(path)
----> 4 deleteFolder('C:\Users\Wes\Desktop\Test\Level1')

<ipython-input-35-14a315bc6a80> in deleteFolder(path)
     30     print('removing path ',path)
     31     print(os.listdir(path))
---> 32     shutil.rmtree(path,False)

C:\Users\Wes\Anaconda2\lib\shutil.pyc in rmtree(path, ignore_errors, onerror)
    250                 os.remove(fullname)
    251             except os.error, err:
--> 252                 onerror(os.remove, fullname, sys.exc_info())
    253     try:
    254         os.rmdir(path)

C:\Users\Wes\Anaconda2\lib\shutil.pyc in rmtree(path, ignore_errors, onerror)
    248         else:
    249             try:
--> 250                 os.remove(fullname)
    251             except os.error, err:
    252                 onerror(os.remove, fullname, sys.exc_info())

WindowsError: [Error 2] The system cannot find the file specified: 'C:\\Users\\Wes\\Desktop\\Test\\Level1\\Level2'

The directory structure is /Level1/Level2/Level3, and the function is called with Level1 as its argument. Obviously this is a contrived example, since it performs the same recursion that shutil.rmtree is built for, but it makes more sense once you add a condition that decides whether or not to delete each directory.

Here is the print output:

('obj name is ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1\\Level2')
('obj name is ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1\\Level2\\Level3')
('removing path ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1\\Level2\\Level3')
[]
('removing path ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1\\Level2')
[]
('removing path ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1')
['Level2']

So it seems to travel down to Level3, delete Level3, move up to Level2, have no problem seeing that Level3 is no longer a subdirectory of Level2, and delete Level2. But then Level1 still sees Level2, and the call fails. There seems to be some subtlety of scoping as it relates to os.path that I am missing.

Ultimately, I would like to explore an entire tree starting at some root and prune the directories that have no descendants meeting a certain criterion (containing audio files).
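For illustration, the pruning goal described above could be sketched roughly as follows. This is only a sketch under assumptions: the names `AUDIO_EXTENSIONS`, `has_audio`, and `prune_without_audio` are hypothetical, and the extension list is a guess at what counts as an audio file. Each subtree without audio is removed with a single `shutil.rmtree` call, so the code never deletes a child and then immediately lists and deletes its parent.

```python
import os
import shutil

# Assumed set of audio extensions; adjust to the files you actually care about.
AUDIO_EXTENSIONS = {'.mp3', '.flac', '.wav', '.ogg', '.m4a'}


def has_audio(path):
    """Return True if any file under `path` has an audio extension."""
    for _root, _dirs, files in os.walk(path):
        for name in files:
            if os.path.splitext(name)[1].lower() in AUDIO_EXTENSIONS:
                return True
    return False


def prune_without_audio(root):
    """Remove each subdirectory of `root` whose subtree holds no audio file.

    Returns the list of removed paths. `root` itself is never removed.
    """
    removed = []
    for name in os.listdir(root):
        child = os.path.join(root, name)
        # Skip plain files and symbolic links to directories.
        if not os.path.isdir(child) or os.path.islink(child):
            continue
        if has_audio(child):
            # Something below is worth keeping, so look one level deeper.
            removed.extend(prune_without_audio(child))
        else:
            # Nothing below is worth keeping: remove the subtree in one call.
            shutil.rmtree(child)
            removed.append(child)
    return removed
```

Rescanning each subtree with `has_audio` is not the most efficient approach, but it keeps the logic easy to follow.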

Wes
  • Could you post the complete output together with the entire traceback? I'm not entirely sure what is going wrong. – Tadhg McDonald-Jensen Aug 14 '16 at 22:12
  • Edited above to include full trace – Wes Aug 14 '16 at 22:21
  • hmm... is this reproducible? (if you put the folders how they were and run it again do you get the same error?) `shutil.rmtree` uses `os.listdir` to find content to delete so the fact that `os.listdir` incorrectly reports that `Level2` is still present is definitely the cause of the problem. – Tadhg McDonald-Jensen Aug 14 '16 at 22:38
  • Unfortunately it is very reproducible ('obj name is ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1a\\Level2a') ('obj name is ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1a\\Level2a\\Level3a') ('removing path ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1a\\Level2a\\Level3a') [] ('removing path ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1a\\Level2a') [] ('removing path ', 'C:\\Users\\Wes\\Desktop\\Test\\Level1a') ['Level2a'] – Wes Aug 14 '16 at 22:41
  • My code works as I need it to if I just flip the `ignore_errors` flag to True in shutil.rmtree. I just thought I was misunderstanding something about scope. After some further reading, it seems it might just be a problem with the way Windows marks files for upcoming deletion. – Wes Aug 14 '16 at 22:59
  • Fascinating! I wonder if the result of `os.listdir` is cached somehow... one thing to try is since you already know you deleted all the sub contents you could try using `os.rmdir` instead of `shutil.rmtree` – Tadhg McDonald-Jensen Aug 14 '16 at 23:03
  • I notice `shutil` in python 3 has two versions, one that is "vulnerable to race conditions" which is the one used in python 2.7 and one "using fd-based APIs to protect against races", so if you happen to have python 3 installed I would be interested in if `shutil.rmtree.avoids_symlink_attacks` is True in python3 on your machine and if your original code raises the same error in python 3. – Tadhg McDonald-Jensen Aug 14 '16 at 23:23
  • @TadhgMcDonald-Jensen, the safe version isn't available on Windows since the Windows API doesn't expose the kernel's ability to open a path relative to an existing File handle, which it has been able to do since NT 3.1 was released in 1993. Microsoft should expose this feature in the Windows API. It would be easy to version the [`SECURITY_ATTRIBUTES`](https://msdn.microsoft.com/en-us/library/aa379560) structure to support this. – Eryk Sun Aug 15 '16 at 01:45
  • @Wes, at first glance this looks like a race condition in the filesystem. It looks like Level2 has been removed but not yet unlinked from the Level1 parent directory, so it's still included in the directory listing. Try adding a millisecond delay after printing "removing path ..." to see if the problem goes away. I'm not saying that's a proper solution. It's just diagnostic. – Eryk Sun Aug 15 '16 at 01:57

1 Answer


I think your problem may be related to this one: Permission denied doing os.mkdir(d) after running shutil.rmtree(d) in Python

On Windows, shutil.rmtree can return before the files are actually deleted. You can think of the deletion as happening asynchronously, so consecutive rmtree calls may conflict. That is also why pip install sometimes fails when it deletes its cached files with shutil.rmtree during the cleanup phase.

Try putting time.sleep(1) after each rmtree call. Does that help? If it does, your solution would be either to retry the deletion after such an error, or to collect the directories to delete and remove them selectively to avoid conflicts.
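A minimal sketch of the retry option might look like the following. The helper name `rmtree_with_retry` and the default values for `attempts` and `delay` are assumptions chosen for illustration, not something taken from shutil itself.

```python
import os
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=0.1):
    """Call shutil.rmtree, retrying a few times if it raises OSError.

    Returns True once the path is gone; re-raises the last error if
    every attempt fails. (Hypothetical helper, not part of shutil.)
    """
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
        except OSError:
            if not os.path.exists(path):
                return True  # the path vanished in the meantime
            if attempt == attempts - 1:
                raise  # out of attempts, surface the original error
            time.sleep(delay)  # give the filesystem time to catch up
        else:
            return True
    return False
```

In the question's code, this would replace the `shutil.rmtree(path,False)` call at the end of `deleteFolder`. WindowsError is a subclass of OSError, so the error in the traceback would be caught.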

gukoff
  • Deleting files and directories in Windows uses an open handle to set the delete disposition on the underlying filesystem control block. There may be other handle and kernel pointer references (e.g. maybe a malware scanner). All of these references are tracked, and the file is removed and unlinked only when the reference count drops to zero. By this point it's possible another call has already started listing the parent directory, which sees a file or directory that's slated for deletion. – Eryk Sun Aug 15 '16 at 08:16
  • A loop after the recursive `deleteFolder` call can check `os.listdir` until `obj` is unlinked. Maybe check for up to a second before raising an exception. – Eryk Sun Aug 15 '16 at 08:24
  • This seems to be the answer. Thanks, eryksun. I will try your solution to avoid the fixed wait time. – Wes Aug 15 '16 at 14:24
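The polling loop suggested in the comments above could be sketched as follows. The function name `wait_until_unlinked` and the `timeout` and `interval` defaults are assumptions made for this example; the one-second limit follows the comment's suggestion.

```python
import os
import time


def wait_until_unlinked(path, timeout=1.0, interval=0.01):
    """Poll the parent directory listing until `path` no longer appears.

    Returns True if the entry disappeared within `timeout` seconds,
    otherwise False. (Hypothetical helper for illustration.)
    """
    parent, name = os.path.split(os.path.normpath(path))
    deadline = time.time() + timeout
    while True:
        # Compare against the listing, since that is what rmtree will see.
        if name not in os.listdir(parent or os.curdir):
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)
```

Inside `deleteFolder`, this could be called right after the recursive `deleteFolder(obj)` call, raising an exception if it returns False.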