OpenMP: parallel op and errno at once?

Question

While the OpenMP thread number (omp_get_thread_num()) stays the same for a full iteration, as far as I can tell the same underlying thread won't necessarily execute all of it.

This made me wonder how OpenMP deals with non-OpenMP thread locals, e.g. a __thread int or errno, which are local to the underlying thread.

I can't find the information in the doc, but it seems like

#pragma omp parallel for
for (int i = 0; i < 10000; ++i) {
    // executed by omp thread 1, using underlying thread a
    FILE *fileptr = fopen(filenames[i], "rb");
    variable_heavy_op(); // or just yield, or nothing

    // executed by omp thread 1, using underlying thread b
    if (!fileptr)  // local to omp thread 1
        perror(filenames[i]);  // uses errno, local to underlying thread b
}

would risk a rare threading error that is really painful to debug.

errno is a catastrophic design choice, I know, but some crap is difficult to avoid. Another example would be reading the result of a trylock operation in pthreads, or using OpenMP with any non-OpenMP threading primitives or libraries built on them, like the C++ standard library.

The question is: is my assertion correct? Or, simplified: if I create a __thread variable (a thread-local variable not managed by OpenMP), how does that interact with the OpenMP thread pool?

Why and how should the underlying thread change within the loop body? The thread executing the `fopen` will always be the same as the one executing the `perror` in the same iteration of the loop – Homer512, May 16 '23 at 12:17
What is the question? I really don't get where you want to go here... What are "thread a" and "thread b", and how do they start? – PierU, May 16 '23 at 12:18
Is your question: "If inside an OpenMP region I spawn a new thread with a non-OpenMP method, what is `omp_get_thread_num()` supposed to return if invoked in this thread?" – PierU, May 16 '23 at 12:25
No, not spawning a new thread, but a system thread-local variable. The int i above is local to the omp thread; what happens if I add a __thread int j? – midjji, May 17 '23 at 16:44
I think omp can use a thread pool to simulate a higher number of threads, with their context therefore associated by thread number, not system thread id. This lets it hot-swap its thread-local context. – midjji, May 17 '23 at 16:47
I suspect you are confusing threads (which exist at a logical level) with execution cores (which exist at the physical level). OpenMP can start an arbitrary number of threads, regardless of the number of cores. The OS then possibly has to switch the contexts of the cores to execute the different threads, system wide. – PierU, May 17 '23 at 17:41
One thing to note: While technically OpenMP makes, to my knowledge, no guarantees towards thread local variables created through other mechanisms than OpenMP itself, it is reasonable to assume they are all the same or at least compatible. OpenMP is a compiler extension. It would be stupid of compilers to implement the C/C++ thread local variables differently from OpenMP variables. pthread or Win32 variables might be different since these are different abstraction layers but they will be compatible. Programmers expect things to *just work*, so libraries are written with that in mind – Homer512, May 17 '23 at 20:32
If I were working on some tiny script, yes, I would expect crap to work, but not be surprised when it does not. But I am trying to figure out an edge case of an edge case, and the system is large enough that the team has run into and fixed multiple bugs in everything from cmake to glibc, through gcc. It can be that you are right and OpenMP behaves as I would expect and there is a bug either in our code, the hardware or OpenMP, or it could be that there is an important difference between the expected behaviour and the actual behaviour. This is what I am trying to figure out. – midjji, May 17 '23 at 21:08
I mean, just looking at the code, the most likely place for it to fail is if anything in that `variable_heavy_op()` resets `errno`. It [could also be a signal handler](https://stackoverflow.com/questions/48378213/how-to-deal-with-errno-and-signal-handler-in-linux) – Homer512, May 17 '23 at 21:39
Yeah, that would make sense in this example. It's not errno though, that's just the one that worries me. The example I have is an error-checking pthread mutex trylock which is getting EBUSY, when inside a single iteration it should only be possible to get EDEADLK. – midjji, May 17 '23 at 21:43
Maybe I'm wrong, but to me threads are a concept defined at the OS level. Whatever the library/API used to create threads, they are all the same under the hood. If an OpenMP parallel region is encountered within a thread created by another library, OpenMP will handle the existing variables as usual, whether they are threadprivate or not before entering the region. – PierU, May 17 '23 at 23:26

score 2 · Answer 1 · answered May 16 '23 at 12:33

There seems to be a lot going on in this question, with some potential misunderstandings of how OpenMP-based threading works. Specifically, I don't see how the thread executing fopen could differ from the one executing perror, since both calls are in the same loop iteration.

However, to clear the air about the relation between thread numbers and thread-local variables, note this excerpt from the OpenMP standard:
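One way to check this on a given system is to record the OS thread at the start and end of each iteration. The following is only a sketch under assumptions: it is Linux-specific (it uses the gettid system call), `__thread` is the GCC/Clang spelling from the question, and `count_thread_changes` and `tls_marker` are hypothetical names made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* makes glibc declare syscall() */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical marker; volatile so every access really reads or writes
   the calling thread's copy. */
static __thread volatile int tls_marker;

/* Returns the number of iterations whose OS thread id or __thread
   variable changed between the start and the end of the iteration. */
int count_thread_changes(int iterations)
{
    int changed = 0;

    #pragma omp parallel for reduction(+:changed)
    for (int i = 0; i < iterations; ++i) {
        long tid_at_start = syscall(SYS_gettid);  /* Linux thread id */
        tls_marker = i + 1;                       /* tag this thread's copy */

        /* Stand-in for variable_heavy_op(): some work between the checks. */
        volatile double sink = 0.0;
        for (int k = 0; k < 1000; ++k)
            sink += k * 0.5;

        if (syscall(SYS_gettid) != tid_at_start || tls_marker != i + 1)
            ++changed;
    }
    return changed;
}
```

Built with something like `gcc -fopenmp`, calling `count_thread_changes(10000)` should return 0 if each iteration stays on one OS thread. Without `-fopenmp` the pragma is ignored and the loop simply runs serially.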

The values of data in the threadprivate variables of threads that are not initial threads are guaranteed to persist between two consecutive active parallel regions only if all of the following conditions hold:

Neither parallel region is nested inside another explicit parallel region;

The number of threads used to execute both parallel regions is the same;

The thread affinity policies used to execute both parallel regions are the same;

The value of the dyn-var internal control variable in the enclosing task region is false at entry to both parallel regions; and

No teams construct that is not nested inside of a target construct is encountered between both parallel regions.

Neither the omp_pause_resource nor omp_pause_resource_all routine is called.

If these conditions all hold, and if a threadprivate variable is referenced in both regions, then threads with the same thread number in their respective regions will reference the same copy of that variable.

In other words: under normal circumstances, the same underlying thread, with the same thread-local variables, serves the same OpenMP thread number across all parallel regions.
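A small sketch of that guarantee, assuming the listed conditions hold (no nesting, same thread count, dynamic adjustment off). The names `tp_value` and `count_threadprivate_mismatches` are hypothetical, and the serial fallbacks exist only so the code also builds without OpenMP enabled:

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp. */
static int omp_get_thread_num(void) { return 0; }
static void omp_set_dynamic(int enabled) { (void)enabled; }
#endif

/* One copy of tp_value per OpenMP thread. */
static int tp_value;
#pragma omp threadprivate(tp_value)

/* Returns how many threads in the second region do not see the value
   that the thread with the same number stored in the first region. */
int count_threadprivate_mismatches(void)
{
    int mismatches = 0;

    omp_set_dynamic(0);  /* dyn-var must be false at entry to both regions */

    /* First region: each thread tags its own copy with its number. */
    #pragma omp parallel
    {
        tp_value = omp_get_thread_num() + 1;
    }

    /* Second region: the same thread number should see the same copy. */
    #pragma omp parallel reduction(+:mismatches)
    {
        if (tp_value != omp_get_thread_num() + 1)
            ++mismatches;
    }
    return mismatches;
}
```

When called from outside any parallel region, this should return 0 according to the quoted rules. Changing the number of threads between the two regions would void the guarantee.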

Homer512

Ok so to clarify, i do not think that the same system thread is guaranteed to be used for an entire iteration. Further i am certain that the openmp threads do not match one to one with system threads. Just run omp for and add all system thread ids to a synchronized set and this becomes clear. Further i am not talking about the omp thread local parameters, that works as expected i am talking of mixing omp threading with non omp threading. – midjji May 17 '23 at 16:35
@midjji The OpenMP report only describes what happens with OpenMP threads. If you mix in some other threading, you are on your own. You will have to know how the particular OpenMP runtime is implemented. – Vladimir F Героям слава May 17 '23 at 16:46
@midjji You are asserting many things that are not demonstrated. Just show some code that illustrates what you mean. Note that I do not say you're wrong, but I would like to see an illustration with real code. You are talking about "system threads" as if they were pre-existing or could be started on their own, but it makes little sense to me. OpenMP creates some threads that are attached to the process (or master thread), they won't interfere with any thread that is not attached to the process. And no other thread will be attached to the process if you don't start it one way or another. – PierU May 17 '23 at 17:36
@midjji "Ok so to clarify, i do not think that the same system thread is guaranteed to be used for an entire iteration" in that case your understanding of multithreading is wrong on a fundamental level. Your mention of `yield` makes me suspect you misunderstand the relation between a software thread and a logical CPU core. Thread-local storage is **thread-local**, not **cpu-local**. That is something you can only use in the OS kernel – Homer512 May 17 '23 at 20:23
Yes, I am well aware of the difference there, but no, yield (e.g. sched_yield) is obviously not exclusive to the OS kernel, nor are CPU-specific registers or CPU-specific caches out of reach. But that is not what I am talking about. – midjji May 17 '23 at 20:34
@midjji I don't mean that `yield` is exclusive to the kernel, I mean that CPU-local variables are exclusive to the kernel (since it can block thread migrations while accessing them). Your assumption that accessing the thread-local `errno` within a single loop iteration may not be safe can only hold if it is cpu-local. A thread-local variable would remain the same – Homer512 May 17 '23 at 20:42
@midjji to be frank: Even if we don't go into the nitty-gritty of the implementation, the idea that something as simple as `errno` might not work in OpenMP is silly. Nobody would use OpenMP if you could not use 90% of all OS facilities. Even if it isn't spelled out in the standard, you can expect a quality implementation to *just work* – Homer512 May 17 '23 at 20:44
I think I need to provide a full example showing how much stranger OpenMP is than people think. I'll see if I can provide a minimal case as proof. But in the meanwhile, consider that one threading extension to a language without threading should be expected to be incompatible with a different threading extension to the same language, not the opposite. – midjji May 17 '23 at 20:55
Ok homer you are apparently not aware that errno wasnt threadsafe for decades and even now is not guaranteed thread safe in general so please return to the nuclear plant and stop trying to answer. – midjji May 17 '23 at 20:58
@midjji Oh, I'm well aware that errno wasn't a thread-local variable originally. I also assume that you want your code to run on a modern system that wasn't built by someone actively malicious to their users. But yes, as others have commented, you need to demonstrate what you mean – Homer512 May 17 '23 at 21:35
We have a very different experience of C libs. I don’t think they are actively malicious, just impressively creatively bad and wondering why someone is using the crap they threw together as a ugly hack once for anything serious. – midjji May 17 '23 at 21:45
@midjji you really don't have to be aggressive like this towards people that try sorting out your not-so-clear question. – PierU May 17 '23 at 23:13
And Homer didn't need to be condescending, but he was – midjji May 18 '23 at 19:23
I do appreciate his and especially your effort though. – midjji May 18 '23 at 20:04

score -1 · Accepted Answer · answered May 18 '23 at 20:03

Openmp does not guarantee that one omp thread corresponds to one system/os thread, but it does on Linux (pthreads).

openmp does not guarantee that one os thread performs an entire operation, but it should on Linux as a consequence of the one-to-one correspondence between OpenMP threads and OS threads. (This contradicts my previous assertion in the question.)

__thread corresponds to threadprivate on Linux for gcc, but not for icc, though there is a compatibility flag. Clang's OpenMP runtime is based on Intel's, but it's unclear how it behaves; the lack of warnings in tests indicates that it either works like gcc or lacks the warning Intel would give. Working like gcc seems likely.

To answer the question directly: errno works on Linux with gcc if it's implemented as a thread-local, and as Homer512 points out, it's likely that someone fixed it for other common configurations such as Linux + ICC.
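Independently of which thread runs an iteration, a comment on the question points out the more likely failure mode: code between the failing call and the error report overwriting errno. A defensive sketch of the loop from the question, assuming `filenames` holds `count` valid strings (`report_open_failures` and `saved_errno` are hypothetical names used only for illustration):

```c
#include <errno.h>
#include <stdio.h>

/* Tries to open every file and returns how many could not be opened. */
int report_open_failures(const char *const filenames[], int count)
{
    int failures = 0;

    #pragma omp parallel for reduction(+:failures)
    for (int i = 0; i < count; ++i) {
        FILE *fileptr = fopen(filenames[i], "rb");
        /* Copy errno right away, before any other call can overwrite it.
           The copy is only meaningful when fopen actually failed. */
        int saved_errno = errno;

        /* ... work that may itself set errno would go here ... */

        if (!fileptr) {
            errno = saved_errno;   /* restore so perror reports the fopen error */
            perror(filenames[i]);
            ++failures;
        } else {
            fclose(fileptr);
        }
    }
    return failures;
}
```

The local copy lives on the stack of whichever thread runs the iteration, so the report no longer depends on what happens to errno in between.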

midjji

Please add proper references to support this answer. To me, things like *"Openmp does not guarantee that one omp thread corresponds to one system/os thread"* or *"openmp does not guarantee that one os thread performs an entire operation"* is complete nonsense. – PierU May 18 '23 at 20:17
It is supported by the absence of the guarantee in the specification, which also has no statement to the contrary. It's difficult to cite the lack of something in a text. – midjji May 18 '23 at 22:17
OK, so your point is of the kind: "There's no proof that unicorns do not exist, hence they exist". An OpenMP thread **IS** a system thread, all threads are system threads, OpenMP is just a high-level API to create and manage threads, but under the hood OpenMP uses system calls. So again *"Openmp does not guarantee that one omp thread corresponds to one system/os thread"* is nonsense. – PierU May 19 '23 at 06:48
Not the way c style specifications work unfortunately. – midjji May 20 '23 at 09:51
