
Since array.find() iterates over the array, whenever I handle (potentially) large arrays and need to look up items by id, I always make sure to build an indexed object like so:

{ [id: string]: Item }
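For concreteness, here is roughly the pattern I mean (the Item type and sample data are made up):

interface Item {
  id: string;
  name: string;
}

const items: Item[] = [
  { id: "a1", name: "first" },
  { id: "b2", name: "second" },
];

// Build the index once: a single O(n) pass...
const itemsById: { [id: string]: Item } = {};
for (const item of items) {
  itemsById[item.id] = item;
}

// ...then every lookup is O(1), instead of the O(n) of
// items.find(item => item.id === "b2").
const found = itemsById["b2"];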

However, living in the era of V8 (and comparable engine optimisations in Safari and Firefox), I'm wondering whether a simple array.find() is perhaps already optimized for this under the hood, or whether the engine will optimize for it (i.e. build such an indexed object) at runtime as soon as it has had to perform this operation once?

Is it true that modern browsers already have some kind of optimization for O(n) algorithms that could become O(1) with the right data structure? Or am I expecting too much of what these engines can and will do under the hood?

Sventies
  • The optimisations that modern JS engines do don't "reduce complexity", as that is not quite possible. Take the phone book and jumble all the pages. Then find a specific number. How do you optimise that task to find your target immediately without going through pages one at a time? Not really possible. V8 might optimise *repeat* searches - it might, for example, not check the first 10 pages thoroughly for names starting with "B" because those pages are later on. If you want to see if something is slow or not *for your use case*, then read https://ericlippert.com/2012/12/17/performance-rant/ – VLAZ Feb 03 '22 at 15:51
  • You're thinking too much of what the browsers will do under the hood. If you know you're going to be looking things up by id, use an object indexed by id. Guaranteed O(1) rather than depending on a particular implementation of `find`... – Heretic Monkey Feb 03 '22 at 15:51
  • The spec says the method should loop through all elements until it finds one: https://tc39.es/ecma262/multipage/indexed-collections.html#sec-array.prototype.find – Luca Kiebel Feb 03 '22 at 15:52
  • If you keep an "index" of (supposedly) unique keys pointing at array items ("for faster lookup"), wouldn't it make sense to use a Map instead? – myf Feb 03 '22 at 15:54
  • @Luca yeah, but as long as no observable side effect is specified and the result is correct, the engine might still do something under the hood that is not in the spec (e.g. an addition is not necessarily an addition, intermediate values are not necessarily calculated ...) – Jonas Wilms Feb 03 '22 at 21:18

1 Answer


V8 developer here. The time complexity of Array.prototype.find is O(n) (with n being the array's length), and it's fair to assume that it will remain that way.

Generally speaking, it's often impossible for engines to improve the complexity class of an operation. In the case of Array.prototype.find, the predicate function you pass might well care how often it gets called:

[1, 2, 3].find((value, index, object) => {
  console.log(`Checking ${value}...`);  // Or any other side effect.
  return value === 42;
});

In such a case, the engine has no choice but to iterate over the entire array in exactly the right order, because anything else would observably break your program's behavior.

In theory, since JS engines can do dynamic optimizations, they could inspect the predicate function, and if it has no side effects, they could use it to build up some sort of index/cache. Aside from the difficulty of building such a system that works for arbitrary predicates, this technique, even when it does work, would only speed up repeated searches of the same array with the same function, at the cost of wasting time and memory if that exact scenario never occurs again. It seems unlikely that an engine could ever make this prediction with enough confidence to justify investing that time and memory.
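To make that concrete, here is a hypothetical user-land version of such a lazy index (purely illustrative; this is not something V8 actually does):

interface Item { id: string; }

// Hypothetical lazy index: built on the first lookup of a given array,
// then reused for later lookups on that same array object.
const indexCache = new WeakMap<Item[], Map<string, Item>>();

function cachedFindById(arr: Item[], id: string): Item | undefined {
  let index = indexCache.get(arr);
  if (index === undefined) {
    index = new Map(arr.map(item => [item.id, item] as [string, Item])); // one O(n) pass
    indexCache.set(arr, index);
  }
  return index.get(id); // O(1) on every later call
}

// Caveat: the cache silently goes stale if the array is mutated.
// Detecting that reliably, and for arbitrary predicates rather than
// plain id equality, is part of what makes this impractical for an engine.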

As a rule of thumb: when operating on large data sets, choosing efficient algorithms and data structures is worth it. Typically far more worth it than the micro-optimizations we see so often in SO questions :-)

A highly optimized/optimizing engine may be able to make your O(n) code somewhere between 10% and 10x as fast as it would otherwise be. By switching to an O(log n) or O(1) solution on your end, you can speed it up by orders of magnitude. That's often accomplished by doing something that engines can't possibly do. For example, you can keep your array sorted and then use binary search over it -- that's something an engine can't do for you automatically because obviously it's not allowed to reorder your array's contents without your approval. And as @myf already points out in a comment: if you want to access things by a unique key, then using a Map will probably work better than using an Array.
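A sketch of both alternatives, with a made-up Item type (the binary search assumes you keep the array sorted by id):

interface Item { id: string; }

const items: Item[] = [{ id: "a1" }, { id: "b2" }, { id: "c3" }];

// Option 1: keep the array sorted by id and binary-search it: O(log n).
function binaryFindById(sorted: Item[], id: string): Item | undefined {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid].id === id) return sorted[mid];
    if (sorted[mid].id < id) lo = mid + 1;
    else hi = mid - 1;
  }
  return undefined;
}

// Option 2: a Map keyed by id: O(1) average-case lookups.
const byId = new Map(items.map(item => [item.id, item] as [string, Item]));
const hit = byId.get("b2");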

That said, simple solutions tend to scale better than we intuitively assume; the standard warning against premature optimization applies here just as everywhere else. Linearly searching through arrays is often just fine; you don't need a (hash) map just because you have more than three items in it. When in doubt, profile your app to find out where the performance bottlenecks are.
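For instance, a crude way to check whether lookups even matter at your data sizes (the sizes here are made up, and console.time is a rough tool compared to a real profiler):

// Many repeated lookups, so the timings are not dominated by noise.
const size = 100_000; // made-up size; use your real data
const data = Array.from({ length: size }, (_, i) => ({ id: `item-${i}` }));
const dataById = new Map(data.map(item => [item.id, item] as [string, { id: string }]));

console.time("Array.prototype.find");
for (let i = 0; i < 1_000; i++) data.find(item => item.id === "item-99999");
console.timeEnd("Array.prototype.find");

console.time("Map.prototype.get");
for (let i = 0; i < 1_000; i++) dataById.get("item-99999");
console.timeEnd("Map.prototype.get");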

jmrk