I'm trying to implement my own list and map, and I've been having trouble with the end() method since these containers aren't contiguous in memory.

After some research, I ended up implementing end() as a method that returns an iterator holding nullptr (see: In what way does end() point to 'one past the end' in non-contiguous containers?). That is enough for most of my implementation, since it guarantees that list::begin() == list::end() when the list is empty (an empty list has no nodes, so head is nullptr).

However, I've seen that it's possible to call operator--() on past-the-end iterators to point to the actual last element of the container.

#include <iostream>
#include <list>
#include <map>

int main() {
  {
    std::map<int, char> m = {{1, 'a'}, {2, 'b'}, {3, 'c'}};
    auto it = m.end();
    --it;
    std::cout << it->first << "\n";  // output: 3
    std::cout << it->second << "\n"; // output: c
  }
  {
    std::list<int> l{ 1, 2, 8};
    auto it = l.end();
    --it;
    std::cout << *it << "\n"; // output: 8
  }
}

How is this achieved? The implementation of operator--() for _List_iterator in GCC's libstdc++ changes the current node to point to the previous one:

      _Self&
      operator--() _GLIBCXX_NOEXCEPT
      {
        _M_node = _M_node->_M_prev;
        return *this;
      }

But how is this possible if the node is out of the list? How does that node even exist? Does this mean that an extra node is allocated? Are you actually allowed to call operator--() on a past-the-end iterator?

These are the begin() and end() methods of gcc's std::list, in case it helps:

     /**
       *  Returns a read-only (constant) iterator that points to the
       *  first element in the %list.  Iteration is done in ordinary
       *  element order.
       */
      _GLIBCXX_NODISCARD
      const_iterator
      begin() const _GLIBCXX_NOEXCEPT
      { return const_iterator(this->_M_impl._M_node._M_next); }

      /**
       *  Returns a read/write iterator that points one past the last
       *  element in the %list.  Iteration is done in ordinary element
       *  order.
       */
      _GLIBCXX_NODISCARD
      iterator
      end() _GLIBCXX_NOEXCEPT
      { return iterator(&this->_M_impl._M_node); }
François Andrieux
  • nothing prevents the implementation from creating an extra node (that has a link to the last node as `prev`) and having the iterator point to it. You as a user are not allowed to access that node, just the iterator, but the implementation can. – bolov Jul 06 '23 at 19:11
  • This can be done by adding so-called [sentinel nodes](https://en.wikipedia.org/wiki/Sentinel_node). But you should also not think of iterators as being pointers; they are not (at least not always). They can be objects in their own right, so they can have state indicating they are not referring to anything inside the container. – Pepijn Kramer Jul 06 '23 at 19:26
  • `struct NodeBase { NodeBase *prev, *next; } sentinel; struct Node : NodeBase { T data; };` may be useful. – Ted Lyngmo Jul 06 '23 at 19:45
  • *the end() as a method that returns an iterator that holds `nullptr`.* No, it does not hold `nullptr`. GNU STL `list` returns the `end()` iterator referencing the last list item. – 273K Jul 06 '23 at 20:34
  • @273K: no, the GNU `list` returns the `end()` iterator referencing the *sentinel*. (Also, it's not the STL) – Mooing Duck Jul 06 '23 at 20:35
  • Anyway, GNU list is a cycle `__first <-> ... <-> __last <-> end() <-> __first` through `_M_next` and `_M_prev`. – 273K Jul 06 '23 at 20:40
  • @MooingDuck There is no *sentinel* in GNU list, `end()` is referencing `__last._M_node`. – 273K Jul 06 '23 at 20:47
  • I've checked and `(--l.begin() == l.end())` returns `true` for `std::list` (gcc 13.1.1) so I guess it's a cycle. – edugomez102 Jul 06 '23 at 21:09

2 Answers

But how is this possible if the node is out of the list?

Simply, _M_node->_M_prev points to the last node.

How does that node even exist?

In the same way that any node exists. In the case of standard list, it would be created with the provided allocator.

Does this mean that an extra node is allocated?

That seems like a reasonable conclusion. This pattern is called "sentinel node".

Are you actually allowed to call operator--() on a past-the-end iterator?

Yes.
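
As a hedged addition: this should only hold when the container is non-empty, because on an empty container begin() == end() and there is no element to step back to. A minimal sketch of that guard follows; last_element and check_last_element are hypothetical helpers written for this example, not standard library functions.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <map>

// Hypothetical helper (not part of the standard library): returns an iterator
// to the last element, or end() if the container is empty.
template <typename Container>
typename Container::const_iterator last_element(const Container& c) {
  if (c.empty())
    return c.end();           // nothing to step back to
  return std::prev(c.end());  // same effect as --it on a copy of end()
}

// Exercises the helper on the containers from the question.
inline void check_last_element() {
  const std::list<int> l{1, 2, 8};
  assert(*last_element(l) == 8);

  const std::map<int, char> m{{1, 'a'}, {2, 'b'}, {3, 'c'}};
  assert(last_element(m)->first == 3);

  const std::list<int> empty;
  assert(last_element(empty) == empty.end());
}
```

std::prev works on a copy, so the original end() iterator is left untouched.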

eerorika

The standard containers use a bit of a dirty trick (or clever, depending on the way you look at it). (Simplifying the code somewhat).

You start out with a 'base' node type. This is just your forward and backward pointers.

struct ListNodeBase {
  ListNodeBase* next;
  ListNodeBase* prev;
};

The list node extends the base node type with the actual data to store. This separation between the pointers and the data is kinda useful!

template<typename T>
struct ListNode : public ListNodeBase {
  T data;
};
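
One plausible reading of why the separation is useful (my assumption, sketched below): a sentinel built from the base type alone never has to construct a T, so it works even when T has no default constructor. Widget and sentinel_links_to_itself are hypothetical names used only for this illustration.

```cpp
#include <cassert>

struct ListNodeBase {
  ListNodeBase* next;
  ListNodeBase* prev;
};

template<typename T>
struct ListNode : public ListNodeBase {
  T data;
};

// Hypothetical element type with no default constructor.
struct Widget {
  explicit Widget(int value) : id(value) {}
  int id;
};

// A data node carries the links plus the payload, so it is larger than a bare sentinel.
static_assert(sizeof(ListNodeBase) < sizeof(ListNode<Widget>),
              "the sentinel carries only the two links");

// Builds a sentinel for a would-be list of Widget without ever creating a Widget.
inline bool sentinel_links_to_itself() {
  ListNodeBase sentinel;
  sentinel.next = &sentinel;  // empty list: both links point back at the sentinel
  sentinel.prev = &sentinel;
  return sentinel.next == &sentinel && sentinel.prev == &sentinel;
}
```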

Iteration across the linked list ONLY uses ListNodeBase. To access the data, the pointers are downcast to ListNode.

template<typename T>
struct ListIter {
   ListNodeBase* current;

   // Here, however, we downcast to the node type to access the data.
   T& operator*() const
      { return static_cast< ListNode<T>* >(current)->data; }
   T* operator->() const
      { return &static_cast< ListNode<T>* >(current)->data; }

   // Just your bog-standard list traversal.
   // Again, this only operates on the ListNodeBase pointers.
   ListIter<T>& operator++() {
      current = current->next;
      return *this;
   }

   ListIter<T>& operator--() {
      current = current->prev;
      return *this;
   }
};

The trick though is that there is one node of ListNodeBase (which is the list container), and the rest of the nodes are ListNode.

template<typename T> 
struct list : private ListNodeBase {

   typedef ListIter<T> iterator;


   list() 
   {
     // when empty, store pointers to itself 
     this->next = this;
     this->prev = this;
   }

   iterator begin() 
     { return iterator{ this->next }; }

   iterator end() 
     { return iterator{ this }; }
};

In other words, the list container is the sentinel node. (It just doesn't store any data, because it uses the base node type and not the actual node type.)
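
To see the pieces working together, here is a minimal self-contained sketch under stated assumptions. SentinelList, push_back, the comparison operators and last_via_end are additions made for this example and are not taken from any real standard library implementation. It only aims to show that begin() == end() when empty and that --end() reaches the last element.

```cpp
#include <cassert>

struct ListNodeBase {
  ListNodeBase* next;
  ListNodeBase* prev;
};

template <typename T>
struct ListNode : ListNodeBase {
  explicit ListNode(const T& value) : data(value) {}
  T data;
};

template <typename T>
struct ListIter {
  ListNodeBase* current;

  // Downcast to the data node; only valid when current is not the sentinel.
  T& operator*() const { return static_cast<ListNode<T>*>(current)->data; }
  ListIter& operator++() { current = current->next; return *this; }
  ListIter& operator--() { current = current->prev; return *this; }
  bool operator==(const ListIter& other) const { return current == other.current; }
  bool operator!=(const ListIter& other) const { return current != other.current; }
};

// Hypothetical container: the list object itself acts as the sentinel node.
template <typename T>
class SentinelList : private ListNodeBase {
public:
  typedef ListIter<T> iterator;

  SentinelList() { next = this; prev = this; }  // empty: links point to itself
  SentinelList(const SentinelList&) = delete;   // copying omitted for brevity
  SentinelList& operator=(const SentinelList&) = delete;

  ~SentinelList() {
    ListNodeBase* node = next;
    while (node != this) {  // stop when we are back at the sentinel
      ListNodeBase* following = node->next;
      delete static_cast<ListNode<T>*>(node);
      node = following;
    }
  }

  // Links a new node between the current last node and the sentinel.
  void push_back(const T& value) {
    ListNode<T>* node = new ListNode<T>(value);
    node->prev = prev;
    node->next = this;
    prev->next = node;
    prev = node;
  }

  iterator begin() { return iterator{next}; }
  iterator end() { return iterator{this}; }
};

// Mirrors the question's example: step back from end() to the last element.
inline int last_via_end() {
  SentinelList<int> l;
  assert(l.begin() == l.end());  // empty list: begin() is the sentinel too
  l.push_back(1);
  l.push_back(2);
  l.push_back(8);
  SentinelList<int>::iterator it = l.end();
  --it;  // sentinel's prev is the last real node
  return *it;
}
```

Because the links form a cycle through the sentinel, incrementing past the last node lands on end(), which is what lets an ordinary `it != l.end()` loop terminate.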

robthebloke
  • I'd be careful with statements like _"The standard containers use..."_ since there are many standard library implementations around and they are not required to use this technique. It's a very useful answer to OPs question nevertheless. – Ted Lyngmo Jul 07 '23 at 16:58