
I am trying to return a list of all files and subfolders in a particular location. My code is as follows:

from pathlib import Path
FOLDER_PATH = Path(r'C:\long\file\path\of\138\characters')

I get the error: FileNotFoundError: [WinError 3] The system cannot find the path specified:

The error occurs on a folder path, not a file, so I'm not sure if that could be the reason.

When I go into the folder manually and try to open the PDF in there, I get "There was an error opening this document. This file cannot be found."

Similarly, when I try to open the XLSX file, I get "This file could not be accessed. Try one of the following: (make sure it exists, isn't read only, isn't more than 218 characters, etc.)"

The file paths in this folder are certainly more than 218 characters, which I understand can be an issue for Excel, but I don't understand why it would be an issue for pathlib.Path.rglob to list them. Does anyone understand this?

However, if I use CMD (dir /s /b > files.txt) I am able to get the list.

Additionally, if I then load files.txt into a list of Path objects (paths) in Python and run [x.is_file() for x in paths], some of the longer paths are not identified as files.

I have verified that if I copy the directory locally (where a much shorter path exists) that the files are accessible by Excel and pathlib.Path.rglob.

What can be done to work around this issue, and why is it an issue in the first place?

teepee
  • [Microsoft documentation](https://learn.microsoft.com/en-us/windows/desktop/fileio/naming-a-file#maximum-path-length-limitation) says that the maximum path length for most functions in the Windows API is 260 characters. – Barmar Apr 23 '19 at 16:31

1 Answer


The problem is that most Windows filesystem functions don't accept paths longer than MAX_PATH (260 characters) when they are written in the ordinary form:

r'C:\long\file\path\of\256\characters'

So pathlib and Excel both discover that they can't open the file, or read the directory, using those Windows functions.
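To see which entries of a listing such as files.txt are affected, a length check is enough. This is only a sketch: MAX_PATH is the 260-character limit from the Microsoft documentation (the count includes the terminating NUL, so 259 usable characters), and the function names are made up for illustration.

```python
MAX_PATH = 260  # classic Win32 limit; the count includes the terminating NUL


def exceeds_max_path(path_str):
    # A path fits only if it leaves room for the terminator,
    # so anything longer than 259 characters is too long.
    return len(path_str) > MAX_PATH - 1


def split_by_length(path_strings):
    # Separate a listing into paths the ordinary API can handle
    # and paths that would need the extended-length form.
    short, too_long = [], []
    for p in path_strings:
        (too_long if exceeds_max_path(p) else short).append(p)
    return short, too_long
```

Feeding it the lines of files.txt should show whether the paths that is_file() misreports are the ones in the too_long list.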

The good news is that Windows functions do accept paths that look like:

r'\\?\C:\long\file\path\of\256\characters'

The bad news is that pathlib does not always correctly join paths of this kind:

>>> Path(r'\\?\foo').joinpath(r'\\?\bar')
WindowsPath('//?/foo/bar')  # correct
>>> Path(r'\\?\foo', r'\\?\bar')
WindowsPath('//?/bar')  # incorrect
>>> Path(r'\\?\c:\foo').joinpath(r'c:\bar')
WindowsPath('c:/bar')  # correct, but not the result we want

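The other bad news is that such paths are somewhat limited: when the path passed to a Windows filesystem function starts with \\?\, Windows does no normalization, so forward slashes are not treated as separators and single or double dots are not interpreted as the current or parent directory.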
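Because of that, the path has to be cleaned up before the prefix goes on. A minimal sketch of doing it purely on strings, assuming the input is already an absolute drive-letter path (to_extended is a hypothetical name; ntpath is the standard-library module behind os.path on Windows, so this can also be tried on other systems):

```python
import ntpath

PREFIX = "\\\\?\\"  # the literal text \\?\


def to_extended(path_str):
    # normpath converts forward slashes to backslashes and collapses
    # "." and ".." components, which \\?\ paths will not do for us.
    normalized = ntpath.normpath(path_str)
    if normalized.startswith(PREFIX):
        return normalized
    return PREFIX + normalized


print(to_extended("C:/long/./path/../file.txt"))  # \\?\C:\long\file.txt
```

Unlike resolve(), this never touches the filesystem, so it does not make relative paths absolute or follow symlinks.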

The good news is that a function like the following will convert pretty much any messy path you've come up with, into something that works:

import os
import pathlib

def longname(path):
    return pathlib.Path('\\\\?\\' + os.fspath(path.resolve()))

Beware that resolve() only removes \\?\ from the start of path if path actually exists, so the code above doesn't work in the case where path doesn't exist and already has \\?\. So, either make sure that your program uses "ordinary" paths without a prefix, and calls longname() as the last thing before doing any real file operations, or else enhance longname():

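def longname(path):
    # resolve() makes the path absolute and normalizes slashes and dots
    normalized = os.fspath(path.resolve())
    # add the prefix only if resolve() left the path without one
    if not normalized.startswith('\\\\?\\'):
        normalized = '\\\\?\\' + normalized
    return pathlib.Path(normalized)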
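Two cases the function above doesn't cover are network shares, where Windows expects \\?\UNC\server\share\... rather than \\?\ followed by \\server\share\..., and non-Windows systems, where the prefix makes no sense. The following is a sketch of one way to handle both. The names add_long_prefix and longname_portable are made up for illustration, and I haven't tested it against a real share.

```python
import os
import pathlib

PREFIX = "\\\\?\\"              # the literal text \\?\
UNC_PREFIX = PREFIX + "UNC\\"   # the literal text \\?\UNC\


def add_long_prefix(absolute):
    # Pure string helper; expects an already-resolved Windows path string.
    if absolute.startswith((PREFIX, "\\\\.\\")):
        # Already an extended-length or device path: leave it alone.
        return absolute
    if absolute.startswith("\\\\"):
        # \\server\share\dir becomes \\?\UNC\server\share\dir
        return UNC_PREFIX + absolute[2:]
    return PREFIX + absolute


def longname_portable(path):
    resolved = pathlib.Path(path).resolve()
    if os.name != "nt":
        # No MAX_PATH limit to work around here; return the resolved path.
        return resolved
    return pathlib.Path(add_long_prefix(os.fspath(resolved)))
```

With this, something like longname_portable(FOLDER_PATH).rglob('*') should be able to walk the long folder from the question on Windows, and behaves like a plain resolve() elsewhere.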

The Windows behaviour is documented by Microsoft: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#maximum-path-length-limitation

Steve Jessop
  • From the link you have shared, it seems that dots and double dots are accepted as part of file names (e.g. `\file..txt`), just not interpreted as relative paths as in `\..\file.txt`. – Dr_Zaszuś Jun 19 '20 at 10:29
  • Could we adjust this so it would work on windows and linux? :) – Roelant Dec 07 '22 at 12:50