Well, having grown up in C++, I am always conscious of which algorithm fits which problem. So when I noticed the application starting to behave sluggishly on mobile phones, I immediately started looking at the data structures and how they are represented.
I am noticing a very strange effect: Array.includes is an order of magnitude faster than Set.has, even though Set.has has far more potential to be optimized for lookup; that's the whole idea of using a set.
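To make my expectation concrete, here is roughly how I picture the two lookups (a sketch of the asymptotics only, not of V8's actual implementation; hash is a toy stand-in):

// Toy integer hash, purely for illustration
const hash = (x) => Math.abs(x | 0);

// Array.prototype.includes, as I picture it: linear scan, O(n), early exit on a hit
function includesLinear(arr, x) {
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] === x) return true;
  }
  return false;
}

// Set.prototype.has, as I picture it: hash to a bucket, then scan that tiny bucket, O(1) expected
function hasHashed(buckets, x) {
  const bucket = buckets[hash(x) % buckets.length];
  return bucket !== undefined && bucket.includes(x);
}

So for 1000 entries I'd expect the set to win comfortably, at least for misses.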
My initialization code is (this code is outside the timing for the tests):
function shuffle(a) {
  // Fisher-Yates in-place shuffle
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
}
const arr = [];
for (let i = 0; i < 1000; i += 1) {
  arr.push(i);
}
shuffle(arr);
const prebuildset = new Set(arr);
And the tests are:
(new Set(arr)).has(-1); // 20.0 kOps/s
arr.includes(-1);       // 632 kOps/s
(new Set(arr)).has(0);  // 20.0 kOps/s
arr.includes(0);        // 720 kOps/s
prebuildset.has(-1);    // 76.7 kOps/s
prebuildset.has(0);     // 107 kOps/s
Tested with Chrome 73.0.3683.103 on Ubuntu 18.04 using https://jsperf.com/set-array-has-test/1
I can somewhat expect the versions that create the set on the fly to be slower than directly testing the array for inclusion. (Though I'd wonder why Chrome's JIT doesn't optimize the array away; I've also tested using a literal array, and a literal array vs. a variable makes no difference at all in speed.) However, even the prebuilt set is an order of magnitude slower than the array-inclusion test, and that holds even for a miss (an entry not in the array at all), which forces includes to scan the entire array and should therefore be its worst case.
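For reference, the literal-array variant was along these lines (a reconstructed sketch; the actual test spelled out all 1000 entries):

// Literal array instead of the arr variable; measured speed was identical
[0, 1, 2, 3, 4 /* ...up to 999 in the actual test */].includes(-1);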
Why is this? What kind of black magic is happening?
EDIT: I've updated the tests to shuffle the array so the results aren't skewed by an early exit in Array.includes(). While Set.has is no longer 10 times as slow, it is still several times slower, which is significant and well outside what I expected.
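To illustrate the bias the shuffle removes (an illustrative snippet, not part of the linked jsperf):

const sorted = [...arr].sort((a, b) => a - b); // undo the shuffle
sorted.includes(0);   // best case: match at index 0, a single comparison
sorted.includes(999); // worst hit: scans all 1000 entries before matching
arr.includes(0);      // after shuffling, an average hit costs ~500 comparisons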