
I need to find all regular files in a directory, and would like to use the C++20 ranges (not Eric Niebler's range-v3) library. I came up with the following code:

namespace fs = std::filesystem;

std::vector<fs::directory_entry> entries{ fs::directory_iterator("D:\\Path"), fs::directory_iterator() };

std::vector<fs::path> paths;
std::ranges::copy(entries |
    std::views::filter([](const fs::directory_entry& entry) { return entry.is_regular_file(); }) |
    std::views::transform([](const fs::directory_entry& entry) { return entry.path(); }),
    std::back_inserter(paths));

This works, but I'm uncomfortable with the additional boilerplate of using lambdas; I'm used to the Java 8 streams library, and I don't see why I can't just use member functions directly. This was my first attempt at refactoring:

std::ranges::copy(entries |
    std::views::filter(fs::directory_entry::is_regular_file) |
    std::views::transform(fs::directory_entry::path),
    std::back_inserter(paths));

This resulted in compiler errors:

error C3867: 'std::filesystem::directory_entry::is_regular_file': non-standard syntax; use '&' to create a pointer to member
error C3889: call to object of class type 'std::ranges::views::_Filter_fn': no matching call operator found
...

So I tried this:

std::ranges::copy(entries |
    std::views::filter(&fs::directory_entry::is_regular_file) |
    std::views::transform(&fs::directory_entry::path),
    std::back_inserter(paths));

This fixed the first error, but not the second:

error C3889: call to object of class type 'std::ranges::views::_Filter_fn': no matching call operator found
...

I then found the question "Using member variable as predicate", which looked promising, so I tried:

std::ranges::copy(entries |
    std::views::filter(std::mem_fn(&fs::directory_entry::is_regular_file)) |
    std::views::transform(std::mem_fn(&fs::directory_entry::path)),
    std::back_inserter(paths));

This resulted in new compiler errors:

error C2672: 'std::mem_fn': no matching overloaded function found
...

Note, std::bind doesn't appear to work either. Any help would be appreciated, thanks!

Some programmer dude
Huw Walters
  • Use `static_cast` see dupe: [std::abs with std::transform not working](https://stackoverflow.com/a/35638933/12002570) – Jason Nov 06 '22 at 12:24
  • It's so sad that [`BOOST_HOF_LIFT`](https://www.boost.org/doc/libs/1_80_0/libs/hof/doc/html/include/boost/hof/lift.html) exists, [`boost::hof::construct`](https://www.boost.org/doc/libs/1_80_0/libs/hof/doc/html/include/boost/hof/construct.html) exists, but no utility for member functions exists! :( Should I find something, I'll list it [here](https://stackoverflow.com/questions/65811716/most-terse-and-reusable-way-of-wrapping-template-or-overloaded-functions-in-func/68357266#68357266). – Enlico Nov 06 '22 at 12:34
  • @Enlico I have added a simple macro of that sort to my answer. I haven't tested it and expect it to have some edge cases not covered. It also requires C++20 in contrast to the boost macros, but it should show that there is no general issue with implementing such a macro. – user17732522 Nov 06 '22 at 13:17
  • @user17732522, looks good, thanks! I've made a fix to it. Manually tested on a simple case. – Enlico Nov 06 '22 at 14:19
  • @Enlico Yes thanks, the double `requires` is easy to forget. – user17732522 Nov 06 '22 at 15:44
  • @user17732522, I think you might be interested in [this](https://stackoverflow.com/questions/74394991/what-is-the-minimal-way-to-write-a-free-function-to-get-the-member-of-a-class) related question of mine. – Enlico Nov 10 '22 at 20:34

2 Answers


Just &fs::directory_entry::is_regular_file as argument is in principle correct, assuming that there is only one non-template overload for the function. Pointers can only point to one function (or function template specialization), not to an overload set.

However, per the standard there are two overloads of directory_entry::is_regular_file: one taking no arguments and one taking a std::error_code&. To select one of them for the pointer you would need to add an explicit cast directly around the pointer, with the target pointer type matching the type of the overload you want to select. In this special case the & operator will then select the function matching the target type from the overload set.

But even then, the standard says that the behavior is unspecified (possibly ill-formed) if you form a pointer to member designating a non-static member function of a standard library class. This basically allows the standard library implementer to change the overload set as long as direct calls to the functions behave as if there were exactly the overloads specified in the standard.

Using lambdas as in your first example is the intended use and the only one that is guaranteed to work. You can reduce the boilerplate a bit, though: you don't need to repeat the argument type.

[](auto& entry) { return entry.is_regular_file(); }

will work as well.

If you need this often and you are annoyed by typing out the lambdas, you can also write yourself a macro for it. Something like

#define LIFT_MEMBER_FUNC(func) \
    ([](auto&& obj, auto&&... args) \
    noexcept(noexcept((decltype(obj)(obj)).func(decltype(args)(args)...))) \
    -> decltype(auto) \
    requires requires { (decltype(obj)(obj)).func(decltype(args)(args)...); } \
    { return (decltype(obj)(obj)).func(decltype(args)(args)...); })

and then

std::views::filter(LIFT_MEMBER_FUNC(is_regular_file))

Note that I have not tested the macro and that there may be edge cases I haven't considered. Take it as a guideline for how such a macro may look. Simplified versions that drop the requires clause (making it non-SFINAE-friendly), that drop the noexcept line (making it not forward noexcept), or that replace decltype(X)(X) with just X (making it not perfectly forwarding) would also work in most typical situations.

The noexcept forwarding expects that there won't be any copy/move constructor call for the lambda return value, so it is correct only for C++17 or later and the requires clause would need to be replaced with SFINAE or dropped before C++20.

user17732522
  • Thanks for the detailed answer! Since I'm more familiar with the Java equivalent (I spent 10 years away from C++, and came back to a much changed language!) I had not anticipated the function overload issue. – Huw Walters Nov 06 '22 at 15:46
  • Not sure I want to go down the macro route, but I will use auto as you suggest. – Huw Walters Nov 06 '22 at 15:47
  • It also occurs to me, I could probably encapsulate the lambda functionality in inline helper functions, which would be more type-safe than a macro. – Huw Walters Nov 06 '22 at 15:49
  • @HuwWalters Unfortunately the macro can't be replaced by a function because you would get into exactly the same problem you had originally that you cannot pass an overload set as a function argument. You can of course use a function (whether inline or not) instead of the lambda or store the lambda in some variable to reuse it, but you would need to define one function for each member function name that you want to use. (Sadly we lack reflection to automate this and so macros are at the moment the only choice.) – user17732522 Nov 06 '22 at 15:52
  • Yes, I was thinking to define one function per member function. But probably simpler to stick with lambdas, thanks. – Huw Walters Nov 06 '22 at 16:06
  • @HuwWalters, [there's absolutely no way to abstract the creation of the lambda wrapping an overloaded/templated (member) function without using macros](https://stackoverflow.com/questions/65811716/most-terse-and-reusable-way-of-wrapping-template-or-overloaded-functions-in-func). Plus, in this case, there's no reason not to use it. It is just another abstraction. – Enlico Nov 06 '22 at 17:37

As another answer points out, taking the address of C++ standard library functions is not guaranteed to work. But it is not undefined behavior, and as long as your unit tests cover this code, it will probably be fine (unless you're a language lawyer or portability connoisseur). You just need to disambiguate which of the is_regular_file overloads you mean:

using bool_method = bool (fs::directory_entry::*)() const;

std::ranges::copy(entries |
    std::views::filter(static_cast<bool_method>(&fs::directory_entry::is_regular_file)) |
    std::views::transform(&fs::directory_entry::path),
    std::back_inserter(paths));

If you're wondering why C++ doesn't guarantee it will work, it is because of cases exactly like this: they want standard library implementations to be able to implement the standard library API without worrying about how many overloads are used.

John Zwinck