Optimizing iterative deepening depth first search in C++

Question

I am using an implementation of the IDDFS graph algorithm in a project and I noticed that for each node, the function takes around 8 - 10 seconds to compute. And, with a very large dataset with over 7000 nodes, it is proving to be the most cumbersome. So, I am trying to optimize my code so that it will run faster but all I have found so far is to change my function call by value parameters to being called by reference which has made a decent difference.

Are there any other ways I could fasten my code? I am compiling using c++11.

// Utility DFS function -- returns DFS Path if it exists, -1 if not exists
vector<int> dfs_util(vector<int> path, int target, vector<vector<int>> &adj_list, int depth)
{
    int curr_node = path.back();
    if (curr_node == target)
        return path;
    if (depth <= 0)
    {
        vector<int> tmp;
        tmp.push_back(CUST_NULL);
        return tmp;
    }
    for (auto child : adj_list[curr_node])
    {
        vector<int> new_path = path;
        new_path.push_back(child);
        vector<int> result = dfs_util(new_path, target, adj_list, depth - 1);
        if (result.back() != CUST_NULL)
        {
            return result;
        }
    }
    vector<int> tmp;
    tmp.push_back(CUST_NULL);
    return tmp;
}

// IDDFS Function -- returns IDDFS Path if it exists, -1 if not
vector<int> iddfs(vector<vector<int>> &adj_list, int src, int target, int max_depth = 2)
{
    vector<int> result;
    max_depth++;
    for (int depth = 0; depth < max_depth; depth++)
    {
        vector<int> path;
        path.push_back(src);

        result = dfs_util(path, target, adj_list, depth);

        if (result.back() == CUST_NULL || result.size() == 0)
            continue;
        int final_index = 0;
        int idx_count = 0;
        for (auto item : result)
        {
            if (item == src)
                final_index = max(final_index, idx_count);
            idx_count++;
        }
        result = vector<int>(result.begin() + final_index, result.end());
    }
    return result;
}

you can try to avoid copy - for example, `for (auto child : adj_list[curr_node])` -> `for (const auto& child : adj_list[curr_node])` — KoVadim, Apr 05 '22 at 14:12
why do you make copies of `path` ? you can use single instance of path vector: add node to path at start of `dfs_util` , remove node from path at the end if path not found — Andrew Kashpur, Apr 05 '22 at 14:20
I have not scrutinized your code but I guess that for such a tree traversal task, several seconds per node is pathological. Are you sure that your algorithm works the way it should ? Have you checked it by hand on a toy case ? — , Apr 05 '22 at 18:56
I assume you're compiling this with GCC or clang `-O3`. You didn't say what CPU you're testing on, so I assume some kind of modern x86. Shouldn't make much difference, but it's always best to report test details along with any time. — Peter Cordes, Apr 06 '22 at 03:04

Jérôme Richard · Answer 1 · 2022-04-05T18:46:47.917

Here are few points to optimize your code:

The parameter vector<int> path is copied for no apparent reasons. You can pass it by reference.
vector<int> new_path = path; create a new copy of path. This is not needed: you can mutate path (passed by reference) so to add an element before the recursive call and remove it after. You can also reserve some space initially so to avoid re-allocations.
Returning a vector<int> in a recursive function is generally not a good idea in term of performance. This is especially true here since the function sometime ruen a vector of size 1 with CUST_NULL inside so to return a special code. This is very inefficient since std::vector will allocate some memory for that and memory allocations are expensive. You can instead return it by reference or play with std::optional so to return a special code without having to create a new non-empty vector. Note that the mutation-based solution is probably faster.
Be aware that push_back do some check to reallocate the target vector if needed. While this is simple and flexible, this is not actually needed for path since its maximum size is known not to be bigger than a threshold (depth +/- 1). You can fix its size to this threshold and fill it based on an index provided in the recursive function.
Using vector<vector<int>> is not efficient since each item of this data structure (of type vector<int>) are likely stored non-contiguously (they may with some luck due to the way low-level allocator works and depending on how your code fill them). It is more efficient to use a big flat vector<int> virtually split in some parts. The starting location of each part can be put in another vector<size_t> (with n+1 items). This makes the code a bit more complex though.
You can unroll/inline the function code when depth is 1 so to avoid many function calls in that case (and copies in your current code). This strategy is not guaranteed to be faster (though it often is) if the compiled function assembly code becomes huge or if this case never happen in practice (eg. to fit in CPU instruction cache).

Optimizing iterative deepening depth first search in C++

1 Answers1