Suppose there is a simple "which thread finishes its loop first" benchmark:
#include <ctime>   // std::clock
#include <iostream>
#include <mutex>
#include <thread>

int main()
{
    std::mutex m; // only guards std::cout so the two lines do not interleave

    std::thread t1([&]() {
        auto c1 = std::clock();
        for (int i = 0; i < 1000000; i++) { /* some unremovable logic here */ }
        auto c2 = std::clock();
        std::lock_guard<std::mutex> g(m);
        std::cout << "t1: " << c2 - c1 << " " << std::endl;
    });

    std::thread t2([&]() {
        auto c1 = std::clock();
        for (int i = 0; i < 1000000; i++) { /* some unremovable logic here */ }
        auto c2 = std::clock();
        std::lock_guard<std::mutex> g(m);
        std::cout << "t2: " << c2 - c1 << " " << std::endl;
    });

    t1.join();
    t2.join();
    return 0;
}
Can we trust clock(), or any other time/clock query function, not to be serialized between threads and to always be independent, so that taking the measurement won't change the order in which the threads complete their work?
If there is a single clock cycle counter for the whole CPU, how does C++ count it per thread? Does it simply broadcast the same value if multiple threads query it at the same time? Or does it serialize the requests at the micro-operation level and serve one thread at a time?
The above code compiles and gives this result (with if(t1.joinable()) and if(t2.joinable()) checks added):
t1: 2
t2: 3
Does this mean thread 1 definitely completed first, or did it actually complete later, but the clock was queried for it first, so that thread 2 was delayed?
Without checking if they are joinable:
t1: 1
t2: 1
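For reference, one way to compare completion order directly, instead of inferring it from per-thread durations, might be to have each thread record an absolute end timestamp on a shared monotonic clock and compare the two after joining. The following is only a sketch under assumptions: `RaceResult`, `busy_loop`, `run_race` and `micros_since_start` are hypothetical names that are not part of the code above, the empty loop body is replaced by a `volatile` store as a stand-in for the "unremovable logic", and `std::chrono::steady_clock` is swapped in for `clock()`, because `clock()` is specified to report processor time used by the program rather than a position on a common timeline. Differences close to the clock's resolution should still be treated as noise.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock; // monotonic clock from <chrono>

// Hypothetical container for one run of the two-thread race.
struct RaceResult {
    Clock::time_point start;   // taken before either thread is launched
    Clock::time_point t1_end;  // taken by thread 1 right after its loop
    Clock::time_point t2_end;  // taken by thread 2 right after its loop
};

// Stand-in for "some unremovable logic": the volatile store keeps the
// compiler from deleting the loop.
inline void busy_loop() {
    volatile int sink = 0;
    for (int i = 0; i < 1000000; i++) { sink = i; }
}

// Runs both loops concurrently. Each thread writes only its own member,
// and the main thread reads them only after join(), so no mutex is needed.
inline RaceResult run_race() {
    RaceResult r{};
    r.start = Clock::now();
    std::thread t1([&r] { busy_loop(); r.t1_end = Clock::now(); });
    std::thread t2([&r] { busy_loop(); r.t2_end = Clock::now(); });
    t1.join();
    t2.join();
    return r;
}

// Converts an end point to microseconds elapsed since the common start.
inline long long micros_since_start(const RaceResult& r, Clock::time_point t) {
    return std::chrono::duration_cast<std::chrono::microseconds>(t - r.start).count();
}
```

Calling `run_race()` and printing `micros_since_start(r, r.t1_end)` and `micros_since_start(r, r.t2_end)` puts both completions on the same timeline, so the smaller value belongs to the thread that took its end timestamp first. Printing happens after both joins, so the output step cannot delay either measurement.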