73

I've not been able to find too much information about them online. What are they and when are they typically used?

Thanks.

Konrad

5 Answers

52

An intrusive list is one where the pointer to the next list node is stored in the same structure as the node data. This is normally A Bad Thing, as it ties the data to the specific list implementation. Most class libraries (for example, the C++ Standard Library) use non-intrusive lists, where the data knows nothing about the list (or other container) implementation.

  • Thanks for the input. If I could, I would have accepted both answers. – Konrad Jul 29 '10 at 13:07
  • 17
    A non-intrusive list means the "next" pointer is on a different cache line than the data itself; so it ends up being up to 2 times slower than an intrusive list for various operations. It also means allocating "link blocks" and managing them, which increases memory consumption and alloc/free overhead. Depending on perspective, intrusive lists are good (for performance) or bad (for convenience in cases where you don't care that your end product is poor quality). – Brendan Jul 07 '16 at 06:19
  • 23
    @Brendan: That's mostly a C/Java problem. C++ doesn't have that issue; `std::list` can be implemented as `struct _Node { _Node *prev, *next; T object; }`. Templates combined with strong value-semantics are what makes these non-intrusive lists efficient. As you can see, there are no separate cache lines or link blocks. – MSalters Sep 12 '16 at 15:34
  • 2
    @MSalters how does a template make something more efficient? (It may be more maintainable, but it wouldn't execute any faster than something implemented in C?) – nhed Oct 28 '16 at 14:40
  • 8
    @nhed: The problem is that in C, you have to choose between a generic and a fast solution. Either the solution is generic (using `void*`) and you lose the locality of reference, or you rewrite your list node types for every element type. A good example is `qsort` vs `std::sort`, where C++ typically beats C by a factor of 600% on integers. Yes, an `intsort()` in C could be as fast as C++, but couldn't sort floats. – MSalters Oct 28 '16 at 19:32
  • 1
    @MSalters in C I use sys/queue.h (http://man7.org/linux/man-pages/man3/queue.3.html) to get the same performance benefits of intrusive containers. I usually try to avoid sorting the data and strive for O(1) for large in-memory datasets. And as far as the specific test set you mention - have a link? I'm just curious how templates improve speed (rather than a specific algorithm perhaps being better) – nhed Oct 30 '16 at 00:51
  • 2
    @nhed `std::sort` advantage is due to inlining. qsort receives type-erased pointer to function acting on void*, while std::sort (due to templates) knows the types of its arguments and more importantly the function being used in sorting (if it's in the same TU), so if it's small enough (and more often than not - it is) it can just inline in and get rid of call overhead. – Dan M. Aug 20 '19 at 16:47
  • 1
    @MSalters You can implement an intrusive generic list in C pretty easily using pointer offsets. Look at the lists implemented in the Linux kernel. The main advantage is zero allocation for all list operations. – Pavel Šimerda Jan 02 '20 at 17:09
  • 1
    @Pavel Šimerda, I'd love to see that if you have a link. (pointer offsets, intrusive generic list in C, linux kernel). Thank you! –  May 15 '20 at 16:27
47

I actually like the intrusive model.

  1. It's better on memory (not many small allocations for things to point at other things)
  2. It allows you to have an object that exists in multiple containers at once.
  3. It allows you to find an element with one search mode (hash) but then find the next element in lexicographic order.
    • This is not the same as #2. It can also be accomplished with boost's multi_index_container, but note that multi_index_container has certain shortcomings that are non-issues with intrusive containers.

Intrusive is GOOD

...you just need to know what you are doing (which is true for any container).

Olivia Stork
nhed
  • 1
    @Andrew obviously helpful answer to many (compared to top answer) without being a paste of a manual page. It does address why/when you would use it (2nd part of question). Enjoy the rest of your day – nhed Feb 22 '18 at 19:11
  • 18
    Answer is practically a non sequitur, downvoting. – The Dembinski Dec 24 '18 at 06:24
  • 1
    Point #2 here seems exactly wrong: data in an intrusive list can only be in one container, because the data object only has one next pointer to serve as "the container." – Ned Batchelder May 01 '21 at 16:36
  • 1
    @ned-batchelder you can have as many next pointers as you want in an intrusive data node. in addition to your node's data members you can have 1 or more list nodes each with their own next/prev links. – nhed May 03 '21 at 11:55
  • Sure, but this list should be "advantages of the intrusive model over the non-intrusive". Intrusive lets an object be part of multiple collections, but only a pre-planned number of specific collections. Non-intrusive is much better at allowing an object to exist in multiple containers at once. – Ned Batchelder May 03 '21 at 18:14
  • Yes "pre-planned", if you are suggesting an edit suggest the edit but I did not see what I wrote as misleading. Especially the benefit over the non-intrusive is made clear in the next bullet. Say I was implementing an ARP table and given a MAC address (of an existing entry) I wanted to get the next MAC address in lexicographic order it would be an O(1) operation. [with the entries having two intrusive link nodes one for a "global" lexicographic list and one for an in-bucket hash list]. – nhed May 03 '21 at 18:35
  • Also note the 3rd bullet's remark about boost's multi_index_container: at least when I wrote this answer (10 years ago) it had awful delete performance. I remember having to swap my use of those out in favor of boost intrusive, as it had real-life impact on my users. While I never looked under the hood of boost multi_index, I presume it's because there was no O(1) way to get to the data node's multiple link-node pointers. Deletion on (yes, pre-planned) multiple intrusive containers is O(1). – nhed May 03 '21 at 21:56
30

It's surprising how many people get this completely wrong (such as the answer from Ziezi). People seem to over-complicate things when it's really pretty simple.

In an intrusive linked list there is no explicit 'Node' struct/class. Instead the 'Data' struct/class itself contains a next and prev pointer/reference to other Data in the linked list.

For example (intrusive linked list node):

struct Data { 
   Data *next; 
   Data *prev; 
   int fieldA; 
   char * fieldB; 
   float fieldC; 
};

Notice how the next and prev pointers sit alongside, and intrude on, the private data fields of the entity such as fieldA. This 'violates' the separation of concerns enforced by standard linked lists (see below), but it greatly reduces the amount of list walking needed to locate specific nodes and lowers the number of memory allocations.

In an intrusive linked list, the 'linked list' itself is often virtual: there is normally no need to create a linked list struct/class at all.

Instead you can simply store a head pointer to the first Data item in some owner/manager. This manager also contains Add/Remove functions to update pointers as needed. For more info see https://gameprogrammingpatterns.com/spatial-partition.html
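To make the "head pointer in a manager" idea concrete, here is a minimal sketch. The `Manager` type and its `add`/`remove`/`count` members are hypothetical names chosen for illustration, and it assumes the caller owns the `Data` objects and only removes objects that are currently linked.

```cpp
#include <cassert>

// Intrusive node: the links live inside the data object itself.
struct Data {
    Data* next = nullptr;
    Data* prev = nullptr;
    int fieldA = 0;
};

// Hypothetical owner/manager: stores only a head pointer and never allocates.
struct Manager {
    Data* head = nullptr;

    // Link d in at the front of the list. O(1).
    void add(Data& d) {
        d.prev = nullptr;
        d.next = head;
        if (head) head->prev = &d;
        head = &d;
    }

    // Unlink d without walking the list. O(1).
    // Precondition (assumed): d is currently linked into this list.
    void remove(Data& d) {
        assert(head != nullptr);
        if (d.prev) d.prev->next = d.next;
        else head = d.next;               // d was the first element
        if (d.next) d.next->prev = d.prev;
        d.next = d.prev = nullptr;
    }

    // Walk the list and count its elements. O(n).
    int count() const {
        int n = 0;
        for (const Data* p = head; p; p = p->next) ++n;
        return n;
    }
};
```

Note that `remove` needs only the object itself, with no search and no allocation, which is where the O(1) claims made for intrusive lists come from.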

Having a single pair of next/prev pointers dictates that each object can only belong to one list. However you can of course add multiple pairs of next/prev pointers as needed (or define an array of next/prev pointers) to support objects in multiple lists.
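As a rough sketch of the "multiple pairs of next/prev pointers" idea: the `Links` struct, the `all`/`active` members and the `List` template below are assumptions made for this example, not part of any library. Each list is told at compile time which pair of links it threads through.

```cpp
#include <cassert>

struct Data;

// One pair of links per list that a Data object may join.
struct Links {
    Data* next = nullptr;
    Data* prev = nullptr;
};

struct Data {
    Links all;     // membership in a hypothetical "all objects" list
    Links active;  // membership in a hypothetical "active objects" list
    int fieldA = 0;
};

// A list head that threads through one chosen Links member of Data.
template <Links Data::*Hook>
struct List {
    Data* head = nullptr;

    void push_front(Data& d) {
        (d.*Hook).prev = nullptr;
        (d.*Hook).next = head;
        if (head) (head->*Hook).prev = &d;
        head = &d;
    }

    // Precondition (assumed): d is currently linked into this list.
    void remove(Data& d) {
        Links& l = d.*Hook;
        if (l.prev) (l.prev->*Hook).next = l.next;
        else { assert(head == &d); head = l.next; }
        if (l.next) (l.next->*Hook).prev = l.prev;
        l = Links{};   // reset only this list's links
    }

    int size() const {
        int n = 0;
        for (const Data* p = head; p; p = (p->*Hook).next) ++n;
        return n;
    }
};
```

With this layout, `List<&Data::all>` and `List<&Data::active>` can hold the same object at once, and removing it from one list leaves the other untouched. The number of lists an object can join is fixed when `Data` is defined.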

In a non-intrusive (i.e. standard) linked list the next/prev pointers are part of a dedicated 'node' entity, and the actual Data entity is simply a field in that node.

For example (non intrusive linked list node and data):

struct Data { 
   int fieldA; 
   char * fieldB; 
   float fieldC; 
};

struct Node { 
   Node *next; 
   Node *prev; 
   Data *data; 
};

Notice how the next/prev pointers do not intrude on the actual Data entity and the separation of concerns is maintained.
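For comparison, a small sketch of the non-intrusive case using `std::list<Data*>`, which plays the role of the `Node` with a `Data *data` field shown above. The function name `shared_membership_sum` is made up for this example.

```cpp
#include <cassert>
#include <list>

// Data carries no link fields at all.
struct Data {
    int fieldA = 0;
};

// The same object is referenced from two standard lists. Each push_back
// makes the list allocate its own node; Data itself is never modified.
inline int shared_membership_sum() {
    Data d;
    d.fieldA = 7;

    std::list<Data*> first;
    std::list<Data*> second;
    first.push_back(&d);
    second.push_back(&d);

    assert(first.front() == second.front());  // both lists point at d
    return first.front()->fieldA + second.front()->fieldA;
}
```

The trade-off is visible here: `Data` stays independent of any container, but every membership costs a separately allocated list node, and finding the node for a given `Data` requires a search.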

Update:

You may see other sites such as https://www.data-structures-in-practice.com/intrusive-linked-lists/ use a 'List' struct (actually a Node) that contains next/prev pointers and is the single intrusive field in the 'Data' struct/class.

This does hide the next/prev pointers from the Data, however it suffers from the need to perform pointer arithmetic simply to access the actual Data associated with the List (Node).

This approach adds needless complexity in my opinion (over simply embedding next/prev fields directly) just for the dubious goal of hiding the next/prev pointers. If you need intrusive lists, keep them as simple as possible. (Also, in managed memory languages it is difficult or impossible to do pointer arithmetic anyway.)
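For readers who want to see what that pointer arithmetic looks like, here is a hedged sketch of the embedded-node style, in the spirit of the Linux kernel lists mentioned in the comments. The names `ListNode`, `init_head`, `insert_after`, `data_from_node` and `sum_fieldA` are invented for this example, and the cast relies on `Data` being a standard-layout type so that `offsetof` is well defined.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Generic link node with no knowledge of the type that embeds it.
struct ListNode {
    ListNode* next = nullptr;
    ListNode* prev = nullptr;
};

struct Data {
    int fieldA = 0;
    ListNode node;   // the single intrusive field
};

// Recover the enclosing Data from a pointer to its embedded node by
// subtracting the member's offset (the "pointer arithmetic" step).
inline Data* data_from_node(ListNode* n) {
    return reinterpret_cast<Data*>(
        reinterpret_cast<char*>(n) - offsetof(Data, node));
}

// Make head an empty circular list: it points at itself.
inline void init_head(ListNode* head) {
    head->next = head;
    head->prev = head;
}

// Insert n right after pos in a circular list. O(1), no allocation.
inline void insert_after(ListNode* pos, ListNode* n) {
    assert(pos && n && pos->next);
    n->prev = pos;
    n->next = pos->next;
    pos->next->prev = n;
    pos->next = n;
}

// Walk the list and sum fieldA of every enclosing Data object.
inline int sum_fieldA(ListNode* head) {
    int total = 0;
    for (ListNode* n = head->next; n != head; n = n->next)
        total += data_from_node(n)->fieldA;
    return total;
}
```

The list functions here work for any type that embeds a `ListNode`. Only `data_from_node` is tied to `Data`, which is the genericity this style buys at the cost of the cast.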

Ash
  • Your `Data` class has the same number of memory allocations as Ziezi's. His `T data` member is not a `std::unique_ptr data`; it's allocated contiguously with `prev` and `next` just as in your example. – MSalters Nov 21 '19 at 09:21
  • Sure, I only mention Ziezi answer for the misunderstanding around intrusiveness, not comparing number of memory allocations directly. I am bit rusty on C++ memory allocations though, being only a poor managed memory developer these days:), thanks for the info. – Ash Nov 21 '19 at 11:54
  • 1
    You can (and maybe should) have an explicit list node structure for intrusive lists. See the implementation in the Linux kernel. – Pavel Šimerda Jan 02 '20 at 17:15
5

Intrusive lists are lists where the objects themselves are the heads or cells of the lists. They are good or bad depending on context.

Inside a well-defined module (an inseparable group of classes working together), they may be the BEST means of tying relationships between classes. They allow direct and full management of common relationships at no extra cost, such as uniqueness (e.g. an apple does not appear twice in an appletree, which needs no key to enforce, and an apple does not belong to two distinct trees). They are navigable in both directions (direct access to the appletree given an apple, and to the apples given an appletree). All basic operations are O(1) (no search in some external container).

Intrusive lists are VERY BAD between two modules, because the modules become tied together, and the whole justification for modules is managing code independence.

alta
  • Good point on how intrusive linked lists are sometimes the best way to model relationships. Also avoiding them between two modules. +1 – Ash Nov 21 '19 at 08:44
4

Here is a brief description that is valid for lists as well:

I. Intrusive containers.

The object to be stored contains additional information to allow integration into the container. Example:

template <typename T>
struct Node
{
    Node* next;   // additional
    Node* prev;   // information 
    T data;
};

1. Pros:

  • stores the objects themselves.

  • doesn't involve memory management.

  • iteration is faster.
  • better exception guarantees.
  • predictability in insertion and deletion of objects. (no additional (non-predictable) memory management is required.)
  • better memory locality.

2. Cons:

  • contains additional data for container integration. (every stored type must be adapted (modified) to the container requirements.)
  • caution with possible side effects when changing the contents of the stored object. (especially for associative containers.)
  • lifetime management of the inserted object, independently from the container.
  • object can be possibly disposed before erased from the container leading to iterator invalidation.
  • intrusive containers are NON-copyable and NON-assignable.

II. Non-intrusive containers (C++ standard containers)

The object doesn't "know" or contain details about the container in which it is to be stored. Example:

template <typename T>
struct Node
{
    T data;
};

1. Pros:

  • does not contain additional information regarding the container integration.
  • object's lifetime managed by the container. (less complex.)

2. Cons:

  • store copies of values passed by the user. (inplace construction possible.)
  • an object can belong to only one container. (or the container should store pointers to objects.)
  • overhead on storing copies. (bookkeeping on each allocation.)
  • non-copyable or non-movable objects CAN'T be stored in non-intrusive containers.
  • can't store a derived object and still maintain its original type. (slicing - loses polymorphism.)
Ziezi
  • 1
    nice necro. Regarding your 'cons' in the Non-instrusive segment; Point 1 - not necessarily, inplace construction possible, ie 'emplace' Point 4 - Flat out false – sp2danny Jun 15 '17 at 11:14
  • @sp2danny I've updated it reflecting your remarks, thank you! – Ziezi Jun 15 '17 at 11:49
  • There's an error in the cons section: An object can only belong in one container in the intrusive, not the non-intrusive case. The next/prev pointers will only be relevant for one list. – Antony Riakiotakis Jun 30 '17 at 13:36
  • "iteration is faster." - not in most cases: you've both increased the number of pointers needed per object, and reduced the ability for iteration-efficient memory management. I would remove this point entirely, or else provide a much more nuanced description of what you mean. – Adam Nov 05 '17 at 17:17
  • 13
    -1: This is very confusing. The `Node` structure of the intrusive containers, as you write it, is identical to a typical linked list node, templatized at `T`. You should have explained that in intrusive lists the object type inherits the list node type, effectively coupling with the intrusive container type. – ceztko Jan 17 '18 at 22:46
  • @ceztko the critical thing here is the information needed linking with rest of the nodes and its location with regard to the node, i.e. inside the node or outside of it. The templated data type `T` was inserted for generality and is of secondary importance. Yes, you are right the specific example used is a node of linked list. I accept your comment as valid and its existence will help for future possible confusions. Thank you! – Ziezi Jan 17 '18 at 23:58
  • 3
    Not only very confusing, but plain wrong. The first code example is just a classic NON intrusive node structure. – Ash Nov 21 '19 at 08:36
  • @AntonyRiakiotakis You can put an item to multiple (predefined number of) intrusive lists as well. – Pavel Šimerda Jan 02 '20 at 17:13