
How can I determine whether a file is contained in a given path with Boost.Filesystem v3?

I saw that there are less-than and greater-than operators, but these seem to compare only lexically. The best approach I have found is the following:

  • Take the absolute paths of both the file and the directory
  • Remove the last component of the file's path and check whether the result equals the directory's path (if it does, the file is contained)

Is there any better way to do this?
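
The two steps listed above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: it uses C++17's std::filesystem as a stand-in for Boost.Filesystem v3 (the two differ in details, such as how a trailing separator is reported), and `is_direct_parent` is a hypothetical name, not a library function. It works purely on the path strings, so symbolic links are not resolved.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: true when `dir` is the immediate parent of `file`.
// Assumes std::filesystem (C++17) rather than Boost.Filesystem v3.
bool is_direct_parent(const fs::path& dir, const fs::path& file)
{
    assert(file.has_filename());

    // Step 1: make both paths absolute, then fold away "." and "..".
    fs::path d = fs::absolute(dir).lexically_normal();
    fs::path f = fs::absolute(file).lexically_normal();

    // A trailing separator leaves an empty final component in
    // std::filesystem; drop it so "/a/b/" and "/a/b" compare equal.
    if (!d.has_filename() && d.has_relative_path())
        d = d.parent_path();

    // Step 2: remove the last part of the file and compare.
    return f.parent_path() == d;
}
```

With this sketch, `is_direct_parent("/a/b", "/a/b/c.txt")` is true, while `is_direct_parent("/a", "/a/b/c.txt")` is false because only the immediate parent counts.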

Parker Coates
David Feurle

2 Answers


The following function should determine whether a file name lies somewhere within the given directory, either as a direct child or in some subdirectory.

bool path_contains_file(path dir, path file)
{
  // If dir ends with "/" and isn't the root directory, then the final
  // component returned by iterators will include "." and will interfere
  // with the std::equal check below, so we strip it before proceeding.
  if (dir.filename() == ".")
    dir.remove_filename();
  // We're also not interested in the file's name.
  assert(file.has_filename());
  file.remove_filename();

  // If dir has more components than file, then file can't possibly
  // reside in dir.
  auto dir_len = std::distance(dir.begin(), dir.end());
  auto file_len = std::distance(file.begin(), file.end());
  if (dir_len > file_len)
    return false;

  // This stops checking when it reaches dir.end(), so it's OK if file
  // has more directory components afterward. They won't be checked.
  return std::equal(dir.begin(), dir.end(), file.begin());
}

If you just want to check whether the directory is the immediate parent of the file, then use this instead:

bool path_directly_contains_file(path dir, path file)
{
  if (dir.filename() == ".")
    dir.remove_filename();
  assert(file.has_filename());
  file.remove_filename();

  return dir == file;
}

You may also be interested in the discussion about what "the same" means with regard to operator== for paths.

Rob Kennedy
  • This function should live within Boost. I assume it's your copyright, then may I ask you what are the terms of use? Public domain, please? – vinipsmaker Aug 18 '14 at 18:40
  • @Vinipsmaker, the terms of use for my code are the same as the terms of use for everything on the entire Stack Exchange network, which are linked from the bottom of every page: "User contributions licensed under CC By-SA 3.0 with attribution required." – Rob Kennedy Aug 18 '14 at 19:03
  • I believe this function has a security flaw. Consider `path_contains_file("folder1/folder2", "folder1/folder2/../../secretFolder/secretFile.txt")`: if you use it for access checks, they might wrongly pass. Using `filesystem::canonical` would fix this, but only for existing files! – PhilLab Dec 19 '16 at 09:19
  • Any `filename()` in `dir` is supposed to be a directory name that is misinterpreted as a filename, since by definition a dir is a "dir", not a file. Don't you think that that `if` should be like this: `if (dir.filename() == ".") dir.remove_filename(); else dir /= "/";`? – The Quantum Physicist May 01 '17 at 10:12
  • I think my code is fine the way it is, @The, but if you think it should behave differently, then you're free to write whatever else you want. Be sure to write some test cases. Also remember that this code doesn't actually look at the disk to see what's what. It's all just string-manipulation in memory, so there aren't necessarily any files *or* directories. Recall that for most platforms, `filename()` simply returns the final component of `dir` regardless of what's in disk. – Rob Kennedy May 01 '17 at 11:36
  • @RobKennedy No sure, I changed it the way I want. But we're discussing to get the best out of it :) . What you mentioned about string manipulation is exactly why I mentioned this issue, because the interpretation of the string depends on the semantics of what we're doing. When a user writes `/home/user/something` in the first parameter, then `something` ***must*** be a directory, based on the semantics of what you're doing in your function (it's a `dir`, right?). However, it'll be treated in your function as a file and `something` will be dropped. Check that please. – The Quantum Physicist May 01 '17 at 11:42
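
One possible way to address the `..` traversal concern raised in the comments above is to resolve both paths before comparing them. The following is a sketch under assumptions, not the answer author's code: it uses C++17's std::filesystem instead of Boost.Filesystem v3, relies on `weakly_canonical` so that paths which do not exist yet can still be normalized, and `contains_resolved` is a hypothetical name. Because it consults the disk, symbolic links in the existing part of each path are resolved too.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: true when `file` resolves to a location strictly
// inside `dir`. Assumes std::filesystem (C++17), not Boost.Filesystem v3.
bool contains_resolved(const fs::path& dir, const fs::path& file)
{
    assert(!dir.empty() && !file.empty());

    // Resolve "..", "." and (for the existing part) symlinks. Going
    // through absolute() first keeps both results absolute.
    const fs::path d = fs::weakly_canonical(fs::absolute(dir));
    const fs::path f = fs::weakly_canonical(fs::absolute(file));

    // Walk both paths component by component until they differ.
    auto [d_it, f_it] = std::mismatch(d.begin(), d.end(), f.begin(), f.end());

    // A trailing separator on dir leaves one empty final component.
    if (d_it != d.end() && d_it->empty())
        ++d_it;

    // dir must be fully consumed, and file must have something left over.
    return d_it == d.end() && f_it != f.end() && !f_it->empty();
}
```

With this sketch, a path that climbs out of the directory through `../..` is rejected, because the comparison happens only after the `..` components have been folded away.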

If you just want to lexically check if one path is a prefix of another, without worrying about ., .. or symbolic links, you can use this:

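bool path_has_prefix(const path & p, const path & prefix)
{
    // The first parameter is named p rather than path so that it doesn't
    // hide the path type in the second parameter's declaration.
    auto pair = std::mismatch(p.begin(), p.end(), prefix.begin(), prefix.end());
    return pair.second == prefix.end();
}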

Note that the four-parameter overload of std::mismatch used here wasn't added until C++14.

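Of course, if you want more than a strictly lexical comparison of the paths, you can call lexically_normal() or canonical() on either or both parameters. Keep in mind that lexically_normal() only folds away . and .. in the string, whereas canonical() also resolves symbolic links and therefore requires the path to exist.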
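
As an illustration of the normalization idea, here is a hedged sketch that applies lexically_normal() to both arguments before the prefix check. It assumes C++17's std::filesystem rather than Boost.Filesystem, and `path_has_normalized_prefix` is a hypothetical name. It also skips the empty final component that std::filesystem reports for a prefix with a trailing separator, which is one way to sidestep the trailing-slash mismatch mentioned in the comments below. Nothing here touches the disk, so symbolic links remain unresolved.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical variant of path_has_prefix that normalizes both paths
// lexically first. Assumes std::filesystem (C++17), not Boost.
bool path_has_normalized_prefix(const fs::path& p, const fs::path& prefix)
{
    assert(!prefix.empty());

    const fs::path np = p.lexically_normal();
    const fs::path nprefix = prefix.lexically_normal();

    // In std::filesystem, "/tmp/" iterates as "/", "tmp", "" (an empty
    // final element); leave that element out of the comparison.
    auto prefix_end = nprefix.end();
    if (!nprefix.has_filename() && nprefix.has_relative_path())
        --prefix_end;

    // Four-iterator std::mismatch (C++14) stops at the shorter range.
    auto pair = std::mismatch(np.begin(), np.end(), nprefix.begin(), prefix_end);
    return pair.second == prefix_end;
}
```

With this sketch, both `"/tmp"` and `"/tmp/"` are accepted as prefixes of `"/tmp/foo"`, while `"/tmp/a/../../etc/passwd"` is not considered to be under `"/tmp"`.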

Parker Coates
  • this is cleaner and should be faster. – Marek R Mar 29 '19 at 12:50
  • FTR, `path_has_prefix("/tmp/foo", "/tmp/") == false`, because "/tmp/" is `[/, tmp, .]`, while `path_has_prefix("/tmp/foo", "/tmp") == true`. The complexity of @Rob Kennedy's answer is warranted. – erenon Apr 30 '20 at 10:03