
I often see, and am inclined to write, code that makes only one pass over a list if it can be helped. I think part of me (and of the people whose code I read) feels it might be more efficient. But is it actually? Often, making multiple passes over a list, each doing something different, leads to much more separable and easily understood code. I recognize there's some overhead in setting up a for loop, but I can't imagine it's significant at all?
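To make the two styles concrete, here is a minimal Python sketch (a hypothetical example; the question is language-agnostic) of computing the minimum, maximum, and sum of a list in one fused pass versus three separate passes:

```python
# Hypothetical example: the same three results computed two ways.

def one_pass(xs):
    """Single loop doing O(y) work per element (here y = 3)."""
    lo, hi, total = xs[0], xs[0], 0
    for x in xs:
        lo = min(lo, x)
        hi = max(hi, x)
        total += x
    return lo, hi, total

def multi_pass(xs):
    """Three separate passes, each doing O(1) work per element."""
    lo = min(xs)        # pass 1
    hi = max(xs)        # pass 2
    total = sum(xs)     # pass 3
    return lo, hi, total

print(one_pass([3, 1, 2]))    # (1, 3, 6)
print(multi_pass([3, 1, 2]))  # (1, 3, 6)
```

Both produce the same result; the question is which is faster, and whether the multi-pass version's readability is worth whatever it costs.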

From a big-O perspective, both are clearly O(n). But suppose each loop carries an overhead* o, and compare y for loops each doing O(1) work per element against 1 for loop doing O(y) work per element:

  1. In the first case, y·O(n + o) = y·O(n + cn) = O(yn + ycn) = O(yn) + O(ycn) = O(yn) + y·O(o)
  2. In the second, O(ny + o) = O(yn) + O(o)

Basically they're the same, except that the overhead is multiplied by y (which makes sense, since we create y for loops and pay +o overhead for each).

How relevant is the overhead here? Do compiler optimizations change this analysis at all (and if so, how do the most popular languages handle it)? Does this mean that when the number of operations per element (say we split 1 for loop into 4) is comparable to the number of elements iterated over (say a couple dozen), making many loops makes a big difference? Or does it depend on the case, and is it the kind of thing that needs to be tested and benchmarked?

*I'm assuming o is proportional to cn, since there is an update step in each iteration
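As a starting point for the "test and benchmark" option, here is a minimal benchmark sketch using Python's `timeit` module. The data size and `number` are arbitrary choices; absolute numbers will vary by machine, interpreter, and data size, so only the relative trend on a given setup is meaningful.

```python
# Minimal sketch: time a fused single-pass loop against two separate
# passes (min and max). Results are machine- and interpreter-specific.
import timeit

xs = list(range(10_000))

fused = timeit.timeit(
    "lo = hi = xs[0]\n"
    "for x in xs:\n"
    "    if x < lo: lo = x\n"
    "    if x > hi: hi = x",
    globals={"xs": xs}, number=100)

separate = timeit.timeit(
    "lo = min(xs); hi = max(xs)",
    globals={"xs": xs}, number=100)

print(f"fused: {fused:.4f}s  separate: {separate:.4f}s")
```

Note that in CPython the two-pass version often wins despite touching the list twice, because the loops inside `min` and `max` run in C, while the fused loop runs in the interpreter. This is one illustration of the point that the language implementation can matter far more than the loop count.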

David Lalo
  • You almost answered your own question: the overhead of doing `y` passes is generally quite big, since `o` is theoretically proportional to `cn`. However, this is *much more complex* in practice. Sometimes there is no overhead, and sometimes it is significantly faster to use multiple passes. In some rare cases, the overhead can be much bigger than the theoretical one. The memory hierarchy, the low-level processor architecture, and compiler optimizations play a huge role in this. It is *highly dependent on the use case* (and every detail matters). – Jérôme Richard Dec 22 '21 at 02:00
  • Note that doing a benchmark is a good thing for a given architecture, but you should not generalize the outcome, because the result will be affected by details of the underlying architecture. There are cases where 2-pass code is slow on one machine but fast on another, and where 1-pass code gives the opposite result! Thus, if you want to study this deeply, please choose a well-defined use case. – Jérôme Richard Dec 22 '21 at 02:05
  • @JérômeRichard This was a great response! Much appreciated -- it's sort of what I suspected, but I'm glad to hear what sounds like an informed opinion on it. I would like a sense of what might be a good way to write code "by default", without much attention to context. But it makes sense that there is no best practice, that it is very situation-dependent, and that a neat answer likely doesn't exist. – David Lalo Dec 22 '21 at 04:58

0 Answers