I have a problem with an experiment on my computer. I ran 300 tests of a parallel algorithm (32 threads) and saw that about 10% of the tests run faster than the others. The pattern looks like this: 100 tests with a runtime of about 100 ms each, then 30 tests with a runtime of ~80 ms, and then another 170 tests with a runtime of ~100 ms. It happens in every experiment. I have tried OpenMP, TBB, pthreads and std::thread, and it happens with every one of them. What is the reason for this?

CPU: Intel® Core™ i7 Kaby Lake H 2800 - 3800 MHz Cores: 4 Threads: 8

I also ran the tests on another computer (Intel® Core™ m3-6Y30), but the experiment shows no such problem there.

I cannot show the plot of my experiment (not enough reputation), but here is part of it in text format:

841618
846348
859046
847833
841801
847680
849084
... (about 115 tests with avg ~840000-860000 ms)
784754
784754
759525
... (about 40 tests with avg ~750000-790000 ms)
855215
846631
850249
847015
...(about 120 tests with avg ~840000-860000 ms)
778716
765774
...(about 30 tests with avg ~750000-780000 ms)

etc.

I have also logged system parameters such as CPU temperature and power, and they show the same pattern. I don't know why these parameters behave this way. Here is the code that measures the experiment's time (using std::chrono):

std::chrono::time_point<std::chrono::high_resolution_clock> start, end; 
std::size_t total; 
start = std::chrono::high_resolution_clock::now(); 
std::complex<double> * X = DirectTransform(compl_val); 
end = std::chrono::high_resolution_clock::now(); 
total = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
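
For comparison, here is a minimal sketch of the same measurement wrapped in a reusable loop. It is not a fix for the pattern described above; it only makes the per-run samples easier to collect and compare across the different builds. It assumes `std::chrono::steady_clock` is acceptable in place of `high_resolution_clock` (the standard allows `high_resolution_clock` to be an alias of `system_clock`, which is not guaranteed to be monotonic). The helper name `time_runs` is hypothetical and does not come from the original code.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the original code): calls `work` `runs`
// times and returns the wall-clock duration of each call in microseconds.
template <typename Work>
std::vector<std::int64_t> time_runs(Work&& work, std::size_t runs) {
    // Assumption: a monotonic clock is preferred for interval measurement.
    using clock = std::chrono::steady_clock;

    std::vector<std::int64_t> samples;
    samples.reserve(runs);  // allocate up front so the loop itself does not allocate

    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = clock::now();
        work();
        const auto end = clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(end - start).count());
    }
    return samples;
}
```

With the code above, a call such as `time_runs([&] { X = DirectTransform(compl_val); }, 300)` would produce the 300 samples in one vector, which can then be written out together with the logged temperature and power values.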

Here is the compilation script:

g++ ./fourier/main.cpp -o ./build/fourier.out 
g++ -pthread ./fourier-h/main.cpp -o ./build/fourier-std.out 
g++ -fopenmp ./fourier-omp/main.cpp -o ./build/fourier-omp.out 
g++ -ltbb ./fourier-tbb/main.cpp -o ./build/fourier-tbb.out 
g++ -pthread ./fourier-pth/main.cpp -o ./build/fourier-pth.out 

Hardware Overview:

  • Model Name: MacBook Pro
  • Model Identifier: MacBookPro14,3
  • Processor Name: Intel Core i7
  • Processor Speed: 2,8 GHz
  • Number of Processors: 1
  • Total Number of Cores: 4
  • L2 Cache (per Core): 256 KB
  • L3 Cache: 6 MB
  • Memory: 16 GB
  • Boot ROM Version: 185.0.0.0.0
  • SMC Version (system): 2.45f0

  • I'm afraid we need more information: Please provide at least one of your test cases as a [mcve], a description of how you measure, and the specific actual results. Tell us your operating system, compiler information and flags, and as much of the system configuration as possible. – Zulan Jan 25 '19 at 15:18
  • Shot in the dark: thermal management. Can you establish core cycle counts? (Are the 6 digit numbers presented as `ms` *micro-* or *milli-seconds*?) – greybeard Feb 07 '19 at 21:05
  • @greybeard it is microseconds – Luba Philippova Feb 08 '19 at 05:05
  • I don't see that you are controlling for page alignment or checking that cache alignment is correct. – tim18 Feb 08 '19 at 12:11
  • If you are able to make the full code available on GitHub, that might be useful, in the sense that maybe there is something else going on? – Andre M Feb 09 '19 at 17:49