38

I want to retrieve all the matching paths following this pattern in a vector<string>:

"/some/path/img*.png"

How can I do that simply?

Benjamin Crouzier

5 Answers

62

I have this in a gist. I wrote an STL wrapper around glob() that returns a vector of strings and takes care of freeing the glob result. It is not especially efficient, but the code is a little more readable and, some would say, easier to use.

#include <glob.h> // glob(), globfree()
#include <string.h> // memset()
#include <vector>
#include <stdexcept>
#include <string>
#include <sstream>

std::vector<std::string> glob(const std::string& pattern) {
    using namespace std;

    // glob struct resides on the stack
    glob_t glob_result;
    memset(&glob_result, 0, sizeof(glob_result));

    // do the glob operation
    int return_value = glob(pattern.c_str(), GLOB_TILDE, NULL, &glob_result);
    if(return_value != 0) {
        globfree(&glob_result);
        stringstream ss;
        ss << "glob() failed with return_value " << return_value << endl;
        throw std::runtime_error(ss.str());
    }

    // collect all the filenames into a std::vector<std::string>
    vector<string> filenames;
    for(size_t i = 0; i < glob_result.gl_pathc; ++i) {
        filenames.push_back(string(glob_result.gl_pathv[i]));
    }

    // cleanup
    globfree(&glob_result);

    // done
    return filenames;
}
Trevor Boyd Smith
Piti Ongmongkolkul
  • 5
    This does not check for any errors and will leak memory if any of the vector operations throws – wakjah Mar 23 '17 at 11:59
  • 1
    @wakjah I also noticed the lack of error checking (a big no-no), so I edited the code and saved it just a moment ago. – Trevor Boyd Smith May 10 '18 at 16:49
  • 1
    where is the glob.h file? – chaohuang Sep 22 '21 at 04:14
  • 1
    @chaohuang `glob.h` is a system header on Unix/POSIX systems. It should be in the same location as other headers such as `stdio.h`. – grandchild Nov 12 '21 at 17:43
  • @wakjah you mind explaining that? They are already calling globfree after populating the vector, and if an error occurred which prevented populating the vector because glob_result.gl_pathc == 0, it won't stop this code from continuing to call globfree() after that check. What you just asked this user to do is a double free, which is a big no-no last I checked. Sure, you could always abruptly abort your application with an error in attempts to prevent a self-crafted double-free, but most people will not want their app to do that. –  Jan 22 '22 at 20:13
  • 1
    I did not ask the user to free the glob twice. At the time I made the comment there was no error checking at all in the posted code. Some error checking has now been added, but the code is still not great as any of the vector `push_back` operations or string constructions in the loop could throw `std::bad_alloc`, which would then cause `glob_result` to be leaked. Ideally since this is C++ we should create a RAII wrapper for `glob_t`, but that's far outside the scope of a comment. – wakjah Jan 24 '22 at 18:21
  • @wakjah thank you for the clarification. I was sick when I posted that and probably wasn't thinking straight. –  Jan 25 '22 at 13:11
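The RAII wrapper suggested in the comments could be sketched as follows. This is a minimal illustration, assuming POSIX glob(); the names GlobGuard and glob_paths are made up for this sketch and are not part of the answer's code. Returning an empty vector on GLOB_NOMATCH instead of throwing is also a choice of this sketch, not something the answer does.

```cpp
#include <glob.h>      // glob(), globfree(), glob_t

#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: owns a glob_t and always calls globfree() on it,
// even when an exception unwinds the stack.
struct GlobGuard {
    glob_t result{};                       // zero-initialised
    GlobGuard() = default;
    GlobGuard(const GlobGuard&) = delete;  // non-copyable: one owner only
    GlobGuard& operator=(const GlobGuard&) = delete;
    ~GlobGuard() { globfree(&result); }
};

// Hypothetical name; returns every path matching `pattern`.
std::vector<std::string> glob_paths(const std::string& pattern) {
    GlobGuard guard;

    const int rc = ::glob(pattern.c_str(), GLOB_TILDE, nullptr, &guard.result);
    if (rc == GLOB_NOMATCH) {
        return {};  // no match: treated as an empty result in this sketch
    }
    if (rc != 0) {
        throw std::runtime_error("glob() failed with return value " +
                                 std::to_string(rc));
    }

    // If a push throws std::bad_alloc, ~GlobGuard still frees the result.
    std::vector<std::string> paths;
    paths.reserve(guard.result.gl_pathc);
    for (std::size_t i = 0; i < guard.result.gl_pathc; ++i) {
        paths.emplace_back(guard.result.gl_pathv[i]);
    }
    return paths;
}
```

With this, the question's case would be a call such as glob_paths("/some/path/img*.png").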
9

A while ago, when I was bored, I wrote a simple glob library for Windows and Linux (it probably works on other *nixes as well). Feel free to use it as you like.

Example usage:

#include <iostream>
#include "glob.h"

int main(int argc, char **argv) {
  glob::Glob glob(argv[1]);
  while (glob) {
    std::cout << glob.GetFileName() << std::endl;
    glob.Next();
  }
}
Martijn Courteaux
szx
  • 2
    This header name will probably conflict with the system glob.h on Unix systems if someone uses a build system and includes it with <> instead of "". – 0xB00B Dec 28 '21 at 02:30
9

You can use the glob() POSIX library function.

R. Martinho Fernandes
sth
  • 1
    glob() is POSIX, not C99, and it does exist on any modern POSIX-compliant UNIX distribution. – jim mcnamara Dec 06 '11 at 22:14
  • Yes, that's right. The point I really was trying to make is that this is a C-style function (producing `char**` and not `std::vector` etc.). But it's still the correct function to use when programming in C++. It also should be rather easy to write a wrapper that provides a "C++-style" interface. – sth Dec 06 '11 at 22:56
  • 1
    I think the OP wanted to glob a vector of strings, and from what I can see the glob(3) function actually looks in the filesystem... Is there any library function that does not access the filesystem -- I need it too. – user9645 Sep 04 '14 at 19:26
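For matching a glob pattern against strings that are already in memory, without touching the filesystem, POSIX also provides fnmatch() in <fnmatch.h>. A minimal sketch, assuming a POSIX system; filter_matching is a hypothetical helper name introduced here.

```cpp
#include <fnmatch.h>   // fnmatch(), FNM_PATHNAME

#include <string>
#include <vector>

// Hypothetical helper: keeps the candidates that match a shell-style pattern.
// No filesystem access happens; fnmatch() only compares strings.
std::vector<std::string> filter_matching(
        const std::string& pattern,
        const std::vector<std::string>& candidates) {
    std::vector<std::string> matches;
    for (const std::string& candidate : candidates) {
        // FNM_PATHNAME: '*' and '?' do not match '/', as in shell globbing.
        // fnmatch() returns 0 on a match.
        if (fnmatch(pattern.c_str(), candidate.c_str(), FNM_PATHNAME) == 0) {
            matches.push_back(candidate);
        }
    }
    return matches;
}
```

Dropping FNM_PATHNAME (passing 0) would let '*' span directory separators, which may or may not be what is wanted.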
6

For newer code targeting the C++17 standard, std::filesystem can achieve this with std::filesystem::directory_iterator or its recursive counterpart, std::filesystem::recursive_directory_iterator. You will have to implement the pattern matching yourself, for instance with the C++11 regex library. This is portable to any platform with C++17 support.

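#include <filesystem>
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// list_matching and name_pattern are illustrative names.
// Example: list_matching("/some/path/", std::regex(R"(img.*\.png)"))
std::vector<std::string> list_matching(const std::filesystem::path& folder,
                                       const std::regex& name_pattern)
{
    if (!std::filesystem::is_directory(folder))
    {
        throw std::runtime_error(folder.string() + " is not a folder");
    }
    std::vector<std::string> file_list;

    for (const auto& entry : std::filesystem::directory_iterator(folder))
    {
        if (entry.is_regular_file())
        {
            // Match only the file name, not the whole path.
            const auto base_name = entry.path().filename().string();
            if (std::regex_match(base_name, name_pattern))
                file_list.push_back(entry.path().string());
        }
    }
    return file_list;
}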

Boost.Filesystem provides a similar API for pre-C++17 code. std::string::compare() might be sufficient to find a match; several calls with the pos and len arguments can match sub-strings only.
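For a pattern as simple as img*.png, the std::string::compare() approach might look like this. A minimal sketch; has_prefix_and_suffix is a hypothetical helper, and it only handles a single * between a fixed prefix and a fixed suffix.

```cpp
#include <string>

// Hypothetical helper: true if `name` starts with `prefix` and ends with
// `suffix` without the two overlapping, i.e. it matches "prefix*suffix".
bool has_prefix_and_suffix(const std::string& name,
                           const std::string& prefix,
                           const std::string& suffix) {
    if (name.size() < prefix.size() + suffix.size()) {
        return false;  // too short to contain both parts
    }
    // compare(pos, len, str) returns 0 when the sub-string equals str.
    return name.compare(0, prefix.size(), prefix) == 0 &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

A call such as has_prefix_and_suffix(base_name, "img", ".png") could then serve as the match test inside the directory loop.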

artless noise
1

I tried the solutions above on CentOS 6 and found that I needed to change the glob() call to:

int ret = glob(pat.c_str(), 0, globerr, &glob_result);

(where "globerr" is an error handling function)

Without the explicit 0, I got a "GLOB_NOSPACE" error.
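For reference, a call with explicit flags and an error callback could look like the sketch below. It assumes POSIX glob(); the body of globerr and the helpers glob_status_name and glob_status are assumptions made for illustration, since the answer does not show them.

```cpp
#include <glob.h>    // glob(), globfree(), GLOB_* constants

#include <cstdio>
#include <cstring>   // std::strerror
#include <string>

// Assumed shape of the answer's "globerr": glob() calls it when it cannot
// read a directory. Returning 0 tells glob() to carry on; non-zero aborts.
int globerr(const char* path, int error_number) {
    std::fprintf(stderr, "glob: %s: %s\n", path, std::strerror(error_number));
    return 0;
}

// Hypothetical helper: translates glob()'s return value into a readable name.
const char* glob_status_name(int status) {
    switch (status) {
        case 0:            return "success";
        case GLOB_NOSPACE: return "GLOB_NOSPACE (out of memory)";
        case GLOB_ABORTED: return "GLOB_ABORTED (read error)";
        case GLOB_NOMATCH: return "GLOB_NOMATCH (no matches)";
        default:           return "unknown";
    }
}

// Hypothetical helper: runs glob() with flags = 0 and returns its status.
int glob_status(const std::string& pat) {
    glob_t glob_result{};
    const int ret = glob(pat.c_str(), 0, globerr, &glob_result);
    globfree(&glob_result);
    return ret;
}
```

Printing glob_status_name(ret) rather than the raw number makes it easier to tell GLOB_NOSPACE apart from a plain GLOB_NOMATCH.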

mousomer