0

I'm trying to create my own data-structure that can determine if a value is an element within it in O(1) time with a hash-map.

I'm working on adding more member functions so that it's more similar to the STL containers. Here's what I currently have that's relevant to the problem:

template <class T>
class setfind {

private:
    long long _size = 0;
    unordered_map<T, bool> _set;

public:
    // constructors
    setfind() {}
    // initialize from some iterable
    template <class InputIterator>
    setfind(InputIterator beg, InputIterator en){
        _size = en - beg;
        while (beg != en){
            _set[*beg] = true;
            beg++;
        }
    }

    bool contains(const T &val){
        return _set[val];
    }
    bool contains(const T &&val){
        return _set[val];
    }
};

As you can see, its core is the unordered_map. I want to write a member function that returns a begin iterator to the _set variable. I tried putting this in:

template <class InputIterator>
InputIterator begin()
{
    return _set.begin();
}

but that led to a compilation error saying that there was no matching member function. I don't know enough about iterators and template-classes to fix this. Does anyone know how to implement it or what's wrong? Are there any good resources so that I can learn more about this?

Also some optimization tips for this class would be appreciated because this is going to be used under a time-limit.

EDIT: I’m restricted to using c++11
EDIT2: Fixed a bug in the constructor
EDIT3: Memory leaks and best-practices will not be an issue

feverdreme
  • 467
  • 4
  • 12
  • 2
    Why not just use an `unordered_set`? – Kevin Nov 24 '20 at 23:33
  • @Kevin because as far as I’ve researched it has worst-case O(n) and I want to guarantee that it’s O(1) – feverdreme Nov 24 '20 at 23:36
  • `unordered_set` and `unordered_map` have essentially identical complexity: O(1) expected, but O(n) worst case for both. – Jerry Coffin Nov 24 '20 at 23:48
  • In most standard library implementations, `unordered_set` and `unordered_map` are built on top of the same hashtable internals. By the standard, they're required to do linked lists from each point in the table. The big difference is that map stores `std::pair` and does equality and hash calculations on just the key, while set just stores values and directly compares them. – Charlie Nov 25 '20 at 05:09

1 Answers1

2

I see several issues with your code.

You don't need the _size member at all, get rid of it and use _set.size() when needed.

In your constructor, while (*beg != *en) needs to be while (beg != en) instead. Or better, just geet rid of the manual loop altogether and use the std::unordered_map constructor that takes an iterator pair as input:

// initialize from some iterable
template <class InputIterator>
setfind(InputIterator beg, InputIterator en) : _set(beg, en) {}

In your contains() methods, use _set.find() or _set.contains() instead of _set.operator[] to avoid val begin added to the _set if it does not already exist. Also, it does not make sense to take a const rvalue reference, so just get rid of that overload altogether:

bool contains(const T &val) const
{
    return _set.find(val) != _set.end();
    // or:
    // return _set.contains(val);
}

And lastly, for your begin() method, just use auto for the return type instead of a template, let the compiler deduce the necessary type, eg:

auto begin()
{
    return _set.begin();
}

UPDATE: apparent auto return type deduction was introduced in C++14. So, for C++11, you will just have to state the type explicitly, still don't use a template for it, eg:

unordered_map<T, bool>::iterator begin()
{
    return _set.begin();
}

Or:

auto begin() -> unordered_map<T, bool>::iterator
{
    return _set.begin();
}

Or:

auto begin() -> decltype(_set.begin())
{
    return _set.begin();
}

You can simplify this by declaring an iterator alias in your class (which you should do anyway, if your goal is to make your class act like a standard container), eg:

template <class T>
class setfind {
private:
    unordered_map<T, bool> _set;

public:
    using iterator = unordered_map<T, bool>::iterator;

    ...

    iterator begin(){
        return _set.begin();
    }
};
Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770
  • Since I'm limited to c++11, I don't have access to the _set.contains method or return value deduction – feverdreme Nov 25 '20 at 00:01
  • @feverdream "*I don't have access to the _set.contains method*" - that is why I also suggested `_set.find()`. "*or return value deduction*" - then you are going to have to specify the type explicitly, not with a template. I have updated my answer to show that. – Remy Lebeau Nov 25 '20 at 00:14
  • a quick question: why *does* the documentation say that unordered_map::find is best case O(1) and worst case O(n)? Shouldn't it always be O(1) since you just need to hash the value or am I being really dumb here – feverdreme Nov 25 '20 at 00:30
  • @feverdream you are assuming the implementation of the search uses hashes. It is not required to, but that would be ideal. But it *can* use a linear search instead, if it wants to. – Remy Lebeau Nov 25 '20 at 00:41