I was writing a piece of multithreaded code when I found that a for loop was not terminating. The starting code was approximately like this:
for(int i = V-1-tid; i >= 0; i-=NTHREADS){
    /* stuff */
}
V and NTHREADS are constants, and tid is the thread id passed in through pthread_create.
I then removed everything from the loop and wrote something like this to make sure nothing was interfering with i:
for(int i = 0; i<100; i++){
    std::cout<<i<<"<100? "<<(i<100)<<std::endl;
}
This loop still does not terminate.
I spawn the threads with a simple loop:
for(int i = 0; i < NTHREADS; i++){
    pthread_create(&(threads[i]), NULL, foo, &(parameters[i]));
}
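A minimal sketch of the setup described above, for readers who want something self-contained to experiment with. The values of V and NTHREADS, the ThreadParams struct, the hits array, and the run_strided_threads helper are all assumptions made for illustration; they are not taken from the original code. It shows each thread walking its own strided slice of the index range, counting down from V-1-tid.

```cpp
#include <cassert>
#include <pthread.h>

// Assumed values; the original post only says these are constants.
constexpr int V = 10;
constexpr int NTHREADS = 3;

// Hypothetical parameter struct: the post only says tid is passed
// through the last argument of pthread_create.
struct ThreadParams {
    int tid;
};

// Hypothetical bookkeeping: hits[i] counts how often index i was visited.
// Each index is touched by exactly one thread, so no locking is needed.
static int hits[V];

void *foo(void *args) {
    assert(args != nullptr);
    const ThreadParams *p = static_cast<const ThreadParams *>(args);
    // Strided countdown: thread tid handles V-1-tid, V-1-tid-NTHREADS, ...
    for (int i = V - 1 - p->tid; i >= 0; i -= NTHREADS) {
        hits[i]++;
    }
    return nullptr;  // a thread start routine has to return a void*
}

// Spawns the threads, joins them, and returns how many indices
// were visited exactly once (V if the partition is complete).
int run_strided_threads() {
    pthread_t threads[NTHREADS];
    ThreadParams parameters[NTHREADS];
    for (int i = 0; i < V; i++) {
        hits[i] = 0;
    }
    for (int i = 0; i < NTHREADS; i++) {
        parameters[i].tid = i;
        pthread_create(&(threads[i]), NULL, foo, &(parameters[i]));
    }
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    int visited_once = 0;
    for (int i = 0; i < V; i++) {
        if (hits[i] == 1) {
            visited_once++;
        }
    }
    return visited_once;
}
```

Calling run_strided_threads() from main and comparing the result against V is a quick way to check that the loops terminate and that together they cover every index once.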
I tried declaring i as volatile, but this changed nothing.
If I compile with -O0 then the loop stops correctly, but every optimization level above -O0 has the same problem.
I am using gcc 9.4.0, more specifically g++-9 (Homebrew GCC 9.4.0) 9.4.0, and the flags I am using are:
-O3 -mavx -mavx2 -mfma -std=c++11 -march=native -fno-rtti -lquadmath -lpthread -g
I am currently looking through the assembly output from gcc to see what's happening, but understanding optimized x86 is a bit of a pain.
Am I missing something obvious? Is there anything I can try?
Edit: Added example.
EXAMPLE CODE:
#include <iostream>
#include <pthread.h>

#define NTHREADS 1

void *foo(void *args){
    for(int i = 0; i < 100; i++){
        std::cout<<i<<std::endl;
    }
}

int main(){
    pthread_t threads[NTHREADS];
    for(int i = 0; i < NTHREADS; i++){
        pthread_create(&(threads[i]), NULL, foo, NULL);
    }
    for(int i = 0; i < NTHREADS; i++){
        pthread_join(threads[i], NULL);
    }
}
The output I am getting can be seen here: godbolt.org/z/Mfjrj6Khr