I am looking for scheduling options based on the data accessed by threads. Is there any way to find the cache pages accessed by a specific thread? If I have two threads from two different processes, is it possible to find the common data/pages accessed by both threads?
-
I am using an Intel Xeon X5675 CPU with Linux. It has a performance monitoring unit and HPCs. – naran Apr 15 '13 at 06:19
-
what are you trying to achieve? – didierc Apr 15 '13 at 16:15
-
@didierc If two threads are too intimate (sharing a lot of data), they could be scheduled on the same CPU core, or something similar. I need to know the sharing pattern of the threads. – naran Apr 23 '13 at 03:49
-
what do you mean by page of cache? cpu cache? – didierc Apr 23 '13 at 10:09
1 Answer
Two threads from the same process potentially share the whole process memory space. If the programme does not restrict the threads' access to certain regions of memory, it might be difficult to know exactly which thread should be assigned to which CPU.
For more threads, the problem becomes more difficult, as a thread may share different data with several different threads and so create a network of relationships. If a relation between two threads mandates their affinity to a given CPU core, then by transitivity all the threads of their relationship network should be bound to that very same core as well. Perhaps the number of relations, or some form of clustering analysis (biconnectivity), would help.
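As a rough illustration of the clustering idea, the sketch below groups threads with a union-find structure. It assumes a hypothetical matrix `shared[i][j]` holding the number of pages touched by both thread `i` and thread `j` (how to obtain these counts is not covered here), and a caller-chosen `threshold`. The names `cluster_threads` and `MAX_THREADS` are made up for this example. Note that this computes plain connected components, a simpler stand-in for a biconnectivity analysis.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 64  /* hypothetical upper bound for this sketch */

/* Find the representative of x's group, shortening the path as we go. */
static int find_root(int parent[], int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/*
 * Group n threads: i and j end up in the same group when
 * shared[i][j] >= threshold, directly or through a chain of such pairs.
 * Only the upper triangle (i < j) of the matrix is read.
 * On return, group[i] is the representative thread index of i's group.
 * Returns the number of groups.
 */
int cluster_threads(size_t n, unsigned shared[][MAX_THREADS],
                    unsigned threshold, int group[])
{
    int parent[MAX_THREADS];
    int count = 0;

    assert(n > 0 && n <= MAX_THREADS);

    for (size_t i = 0; i < n; i++)
        parent[i] = (int)i;

    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (shared[i][j] >= threshold) {
                int a = find_root(parent, (int)i);
                int b = find_root(parent, (int)j);
                if (a != b)
                    parent[b] = a;  /* merge the two groups */
            }

    for (size_t i = 0; i < n; i++) {
        group[i] = find_root(parent, (int)i);
        if (group[i] == (int)i)
            count++;  /* each group has exactly one root */
    }
    return count;
}
```

Each resulting group could then be bound to one core. Because grouping is transitive, a low threshold can pull every thread into a single group, which is the difficulty described above.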
Regarding your specific question: if two threads from different processes share data, then those processes must be sharing the pages deliberately, typically by using `shm_open` (to create a shared memory segment) and `mmap` (to map that segment into the process's address space). Apart from such explicit mechanisms, data pages are not shared between processes, except implicitly through the copy-on-write mechanism the OS uses for forked processes, in which case each page remains shared until one process writes to it.
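A minimal sketch of the explicit route, assuming Linux or another POSIX system. The helper name `map_shared_segment` is hypothetical; the calls it wraps (`shm_open`, `ftruncate`, `mmap`) are the standard ones.

```c
#include <assert.h>
#include <fcntl.h>      /* O_CREAT, O_RDWR */
#include <stddef.h>
#include <sys/mman.h>   /* shm_open, mmap, munmap, shm_unlink */
#include <sys/stat.h>   /* mode constants */
#include <unistd.h>     /* ftruncate, close */

/*
 * Map a named POSIX shared memory object of `size` bytes, creating it if
 * it does not exist yet. Returns the mapping, or NULL on failure.
 */
void *map_shared_segment(const char *name, size_t size)
{
    assert(name != NULL && size > 0);

    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return NULL;

    /* A freshly created object has length 0; newly added bytes read as 0. */
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return addr == MAP_FAILED ? NULL : addr;
}
```

Both processes would call this with the same name (for example `"/my_segment"`, an arbitrary choice) and then see each other's writes. On older glibc versions the program may need to be linked with `-lrt`.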
The explicit sharing of pages (via `shm_open`) may be used to programmatically define the same CPU affinity for both threads: either by a convention in both programs that associates the relevant threads with the first core, or through a small handshaking protocol established at some point through the shared memory object. For instance, the first byte of the memory segment could be set to the chosen CPU number + 1 by the first thread to access it, with 0 meaning no affinity yet.
Unfortunately, the POSIX thread API doesn't provide a way to set CPU affinity for threads. On Linux you may use the non-portable extension `pthread_attr_setaffinity_np`, together with the `cpu_set_t` macros (`CPU_ZERO`, `CPU_SET` and related), to configure a thread's affinity.
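A minimal sketch of pinning a new thread to one core on Linux with glibc. The helper names `start_pinned_thread` and `check_pinned` are made up for this example; `_GNU_SOURCE` has to be defined before any header is included, otherwise the affinity macros and `*_np` functions are not declared.

```c
#define _GNU_SOURCE     /* must come first: exposes CPU_SET and the *_np calls */
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/*
 * Start fn(arg) in a new thread restricted to a single CPU.
 * Returns 0 on success or an error number, like the pthread calls.
 */
int start_pinned_thread(pthread_t *tid, int cpu,
                        void *(*fn)(void *), void *arg)
{
    assert(tid != NULL && fn != NULL && cpu >= 0 && cpu < CPU_SETSIZE);

    cpu_set_t set;
    CPU_ZERO(&set);         /* start from an empty CPU set */
    CPU_SET(cpu, &set);     /* allow only the requested CPU */

    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return rc;
}

/* Example thread body: stores 1 in *arg if this thread may run on
 * exactly one CPU, 0 otherwise. */
void *check_pinned(void *arg)
{
    cpu_set_t set;
    int *ok = arg;

    *ok = pthread_getaffinity_np(pthread_self(), sizeof(set), &set) == 0
          && CPU_COUNT(&set) == 1;
    return NULL;
}
```

For a thread that is already running, `pthread_setaffinity_np` takes the same kind of CPU set. Thread creation fails if the requested CPU is not among those the process is allowed to use, so the CPU number should be chosen from the current affinity mask. On older glibc versions, compile with `-pthread`.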
-
@didierc Even though threads that share data can form a network of threads, I am only concerned with threads that intimately share data among themselves. I set some threshold to form these thread clusters. Is there any way for a process to dynamically read the data access information of other threads? – naran May 02 '13 at 08:50
-
I added the case of shared pages in the context of forked processes which I did not consider. I think your question really narrows down to the explicit sharing with `shm_open` situation though. – didierc May 02 '13 at 09:32