I have an algorithm question here. There are 10 threads in the system, and you are given a linked list with 10K elements in it. How do you do the thread synchronization (addition, deletion, etc.) so that it is optimized for performance? Using a single mutex for the whole list is not advisable, as it will slow down performance.
-
Are most of the operations at the head/tail, or are the operations generally on random elements spread out over the list? Is the list circular? Is 10K the upper bound for the list size? – jxh May 23 '13 at 10:18
-
Do you insert / delete by giving node number as input ? – bjskishore123 May 23 '13 at 10:20
-
No, the list is not sorted, so operations are not guaranteed to be at the head/tail. The list is not circular. 10K is just an indicative number to say it's a long list. – Adam May 23 '13 at 10:34
-
The way to optimize the code depends heavily on what are all the operations performed on the list. – Dialecticus May 23 '13 at 10:43
-
Please describe how exactly the list is used: how, when, and by whom it is created, read, and modified. And what are the relative frequencies of those operations (what occurs more frequently)? – Dialecticus May 23 '13 at 11:24
-
A linked list data structure assumes all operations follow sequential rules. – Parag Bafna May 23 '13 at 11:44
-
The question is: optimized for ***which*** performance? – UmNyobe May 23 '13 at 11:57
7 Answers
If all the positions are accessed with the same frequency and you can modify the list node, you can add a mutex for every node:
typedef struct listNode {
    char *data;
    struct listNode *next;   /* tag must match the struct's own name */
    pthread_mutex_t lock;    /* per-node lock */
} listNode;
It also depends on the size of each node's data. If the data is very small, the per-node mutexes may cause overhead due to their storage, creation, and destruction.
If that overhead is too high, or you can't modify the node, you can split the list into (for example) 100 groups of 100 elements and use one mutex per group.
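One way to realize "a mutex per group" is to keep each group as its own sublist and pick the group by hashing the key, so each operation takes only that group's lock. A minimal sketch under that assumption (the grouping-by-key scheme and all names here are illustrative; the question doesn't specify how elements are grouped):

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_GROUPS 100 /* e.g. 100 groups for a 10K-element list */

typedef struct Node {
    int key;
    struct Node *next;
} Node;

/* One sublist plus one mutex per group. */
typedef struct {
    pthread_mutex_t lock;
    Node *head;
} Group;

static Group groups[NUM_GROUPS];

static void groups_init(void) {
    for (int i = 0; i < NUM_GROUPS; i++) {
        pthread_mutex_init(&groups[i].lock, NULL);
        groups[i].head = NULL;
    }
}

/* Pick the group for a key; only that group's mutex is ever held. */
static Group *group_for(int key) {
    return &groups[(unsigned)key % NUM_GROUPS];
}

static void group_insert(int key) {
    Group *g = group_for(key);
    Node *n = malloc(sizeof *n);
    n->key = key;
    pthread_mutex_lock(&g->lock);
    n->next = g->head;
    g->head = n;
    pthread_mutex_unlock(&g->lock);
}

static int group_contains(int key) {
    Group *g = group_for(key);
    int found = 0;
    pthread_mutex_lock(&g->lock);
    for (Node *n = g->head; n; n = n->next)
        if (n->key == key) { found = 1; break; }
    pthread_mutex_unlock(&g->lock);
    return found;
}
```

Two threads operating on keys in different groups never contend, which is the point of splitting the list.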

-
This is my first thought too, but while traversing the list you must lock/unlock all the mutexes in your path, which does not seem very efficient. – Dialecticus May 23 '13 at 10:42
A linked list data structure assumes all operations follow sequential rules; take a look at concurrent linked lists.
No matter what kind of machinery you use to implement it, the interface and expected behavior imply sequential logic.

It depends on the operations you want to perform on the linked list, and on whether the list is sorted.
- If you are concerned about two threads changing the value of the same node, add a mutex for each node, as mentioned in another answer here.
- If you are concerned about list operations (add, remove), it depends on whether you do more reads than writes: if so, use a reader-writer lock. If each thread works on its own part of the list, you can give remove access only to the relevant thread.

I find Evans's answer proper, but I'd suggest using spinlocks instead of mutexes. Spinlocks are more efficient when concurrency is low and the locks are held for short times.
typedef struct ListNode {
    void *data;
    struct ListNode *next;
    pthread_spinlock_t lock;
} ListNode;
The proper solution depends greatly on the frequency of the operations on the object you want synchronized (the list, in your case). Operations on a container are container creation, container modification, container traversal, and item modification. If, for instance, the list is mostly traversed and read from, it could be that the list is the wrong container for the job. Maybe you really need some sort of map (also called a dictionary), which provides very fast read access given a key. In that case there is no traversal at all, and it could be that synchronizing the whole map container turns out to be the most efficient approach, simply because we changed the type of the container.

First, assume that adding/removing list elements is not the reason for being multi-threaded (rather, the logic to determine/create these elements is the taxing process); if list insert/remove time is the bottleneck, then maybe you should reconsider your data structure.
Next, assume that the threads do not interact with each other (one thread won't delete a record that was inserted by another) and that each thread has a finite amount of work. Then have each thread not touch the actual linked list; instead, each thread maintains two supplementary lists.
It works like this:
Each thread runs and builds two supplementary lists: records to delete and records to insert.
Since the list is unsorted, when the threads all finish we can simply append each thread's "new items" list to the start or end of the existing list.
For deleted items, we merge the delete lists from all the threads and then traverse the original linked list, removing the items as we encounter them (performance can be improved here by using a hashtable for the set of items to delete).
This works very well provided the two assumptions at the start hold true. It also means there is no need for mutexes or locks: the main list is only updated at the end, by a single thread, after all the worker threads have joined back into the main thread.
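The scheme above can be sketched as two merge steps run by the main thread after the workers join (the names and the linear scan over the delete list are illustrative; a hash set over the delete keys would make that lookup O(1), as the answer suggests):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *next;
} Node;

Node *node_new(int key, Node *next) {
    Node *n = malloc(sizeof *n);
    n->key = key;
    n->next = next;
    return n;
}

/* Append one thread's private "to insert" list to the front of the main
   list. Done once per thread, after all worker threads have joined. */
Node *splice_front(Node *main_list, Node *pending) {
    if (!pending) return main_list;
    Node *tail = pending;
    while (tail->next) tail = tail->next;
    tail->next = main_list;
    return pending;
}

/* One pass over the main list, unlinking every key in the merged delete
   list (linear lookup here; a hash set would speed this up). */
Node *apply_deletes(Node *main_list, const int *del, size_t ndel) {
    Node **pp = &main_list;
    while (*pp) {
        int hit = 0;
        for (size_t i = 0; i < ndel; i++)
            if ((*pp)->key == del[i]) { hit = 1; break; }
        if (hit) {
            Node *dead = *pp;
            *pp = dead->next;
            free(dead);
        } else {
            pp = &(*pp)->next;
        }
    }
    return main_list;
}

int list_contains(Node *l, int key) {
    for (; l; l = l->next)
        if (l->key == key) return 1;
    return 0;
}

size_t list_len(Node *l) {
    size_t n = 0;
    for (; l; l = l->next) n++;
    return n;
}
```

Because both steps run in a single thread, no synchronization primitives are needed at all, which matches the answer's point.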
