I profiled two programs with Intel VTune, one optimized and the other not, and the results were a little weird: Instructions Retired was about 7,400,000 in both, and the CPI rate of the optimized program was higher than that of the unoptimized program! Can anyone help me understand this?
How many samples have you collected? Make sure a reasonable number of samples is collected in both cases by running the workload for at least a second. Please post the code as well, if possible. – Elalfer Mar 05 '14 at 00:17