0

So I want an interface that stores some numbers in order and efficiently perform some operations on them. In practise, a sorted vector performs very well for small sizes, but for theoretical performance and performance on larger instances, I need also a variant that uses something like std::set to perform inserts and erases in O(log n) time.

Here some implementation for the vector variation:

#include <vector>
#include <algorithm>

class Numbers {
public:
    template<class ForwardIt>
    void unguarded_insert(ForwardIt it, int val) { vec.insert(it, val); }

    auto lower(int val) { return std::lower_bound(vec.begin(), vec.end(), val); }
    
    template<class ForwardIt>
    void unguarded_replace(ForwardIt it, int val) { *it = val; }

    template<class ForwardIt>
    auto erase_elements(ForwardIt first, ForwardIt last) { return vec.erase(first, last); }
    
private:
    std::vector<int> vec;
};

Note that for insertion and replacing it it ensured by the caller that they are at the right position. For example the caller might want to delete all elements between 20 and 50 and insert 30. So one can call lower_bound, replace the first element and delete the remaining range.

I do not want to write the entire interface twice and would like a templated solution using the container type as template argument. However, I am running into problems like lower_bound is a member function for some containers and a free function for other containers. Also insertion and replacing at an iterator cannot really be done by std::set because the order is unnecessarily checked (right?).

So is there some way to fix these problems or a different kind of approach that solves my problem? The solution should not do unnecessary work on either container just to make it work on both.

Henk
  • 826
  • 3
  • 14
  • **It does not make much senses** A `vector` does not have a member `lower_bound` because a vector does have any sort order associated to it. When you use the free function, you as the programmer know how the vector is sorted. On the other end, for a set it make sense to have a member function as `lower_bound` would work correctly only when comparison do match and it is more efficient because the set knows how to find nodes with minimal comparisons. In practice, most of the time it is easy to select appropriate container either for efficiency (for large data) or for convenience (small data). – Phil1970 Jan 27 '21 at 23:24
  • And if you use C++, then you should follow C++ philosophy and not work around it. **And if your data is complex and have complex requirement for accessing it, then you probably want to define a class that manage data so that callers need not to be aware of implementation details like which container was selected. This is even more true, if you need more than one container maybe with different sort keys. – Phil1970 Jan 27 '21 at 23:29
  • I'd just use `set` for everything. – Caleth Jan 28 '21 at 09:14
  • This is really slow (like even for hundreds of elements it was 50% slower). You should basically never use ```std::set``` as standard. – Henk Jan 28 '21 at 09:15
  • I can still have a wrapper around both versions of containers that chooses the more efficient one depending on a threshold. Code duplication is not ```C++``` philosophy and makes the code much harder to maintain. (When adding a function, you have to do it for both). – Henk Jan 28 '21 at 09:17
  • It's *slower*, but I wouldn't call it *slow*, at hundreds of elements – Caleth Jan 28 '21 at 09:19
  • 1
    See also [boost::flat_set](https://www.boost.org/doc/libs/1_75_0/doc/html/boost/container/flat_set.html) – Caleth Jan 28 '21 at 09:28

1 Answers1

1

You may not need to implement all of the member functions multiple times, but for some, you need to do it in different ways.

First, a type trait to check the existance of a lower_bound member function

template<class T>
struct has_lower_bound_member_func {
    struct no {};

    template<class U = T>  // on SFINAE match: `auto` is the type of the iterator
    static auto check(int) -> 
        decltype(std::declval<U>().lower_bound(
            std::declval<typename U::value_type>()));

    static no check(long); // No lower_bound found, the type is `no`

    // if the type is *not* `no`, the container has a lower_bound member function
    static constexpr bool value = not std::is_same_v<decltype(check(0)), no>;
};

template<class T>
inline constexpr bool has_lower_bound_member_func_v = 
                          has_lower_bound_member_func<T>::value;

A type trait to check if an iterator is a const_iterator:

template<class It>
struct is_const_iterator {
    using pointer = typename std::iterator_traits<It>::pointer; 

    static const bool value = std::is_const_v<std::remove_pointer_t<pointer>>;
};

template<class T>
inline constexpr bool is_const_iterator_v = is_const_iterator<T>::value;

Here are a few examples how those traits (and any future traits you add) could be used. You can use similar techniques for the other member functions that need special care:

template<class C>
class Numbers {
public:
    using value_type = typename C::value_type;

    template<class T = C>
    auto lower(const value_type& val) {

        if constexpr (has_lower_bound_member_func_v<T>) {
            // a std::set<> will use this
            return data.lower_bound(val);

        } else {
            // a std::vector<> will use this
            return std::lower_bound(data.begin(), data.end(), val);
        }
    }

    template<class ForwardIt>
    auto replace(ForwardIt it, const value_type& val) {
        if constexpr(is_const_iterator_v<ForwardIt>) {

            // move the object out of the container by moving its node
            // out of the set and reinsert if afterwards:

            auto node_handle = data.extract(it);
            node_handle.value() = val;
            return data.insert(std::move(node_handle));

            // or, if "it" is likely pointing close to the old value:
            // return data.insert(it, std::move(node_handle));

        } else {            
            *it = val; // not a const_iterator, assign directly
            // You probably need to sort here:
            std::sort(data.begin(), data.end());
            it = std::lower_bound(data.begin(), data.end(), val);
            // ... or do a clever std::rotate() to get it in place.
            return it;
        }
    }
    
private:
    C data;
};
Ted Lyngmo
  • 93,841
  • 5
  • 60
  • 108