It makes sense that the second scenario should be more effective as the loop limiter need not be calculated for every iteration in the loop. Tried to measure the CPU time used in going thru the 2 types of for loops with the code given below.
Looks like the cpu time is actually lesser for the first scenario for smaller numbers like n=25. but then for values n>=100 the second scenario gives a smaller cpu time.
clock_t start,end;
double cpu_time_used;
double n, root;
n = atoi(argv[1]);
printf("n= %0f \n",n);
start = clock();
for (int i=0; i*i<n; i++);
end = clock();
cpu_time_used = ((double) (end-start)) / CLOCKS_PER_SEC;
printf("first iter: cpu_time_used: %f \n", cpu_time_used);
start = clock();
root = sqrt(n);
for (int i=0; i<=root; i++);
end = clock();
cpu_time_used = ((double) (end-start)) / CLOCKS_PER_SEC;
printf("second iter: cpu_time_used: %f \n", cpu_time_used);
Outputs:
n= 25.000000
first iter: cpu_time_used: 0.000004
second iter: cpu_time_used: 0.000011
n= 100.000000
first iter: cpu_time_used: 0.000002
second iter: cpu_time_used: 0.000001
n= 1000000.000000
first iter: cpu_time_used: 0.000011
second iter: cpu_time_used: 0.000008