17

I have code like this:

List<Listing> listings = new ArrayList<>();
listings.add(listing1);
listings.add(listing2);
...
...
...

Optional<Listing> listing = listings.stream()
                .filter(l -> l.getVin() == 456)
                .findFirst();

My question is: what is the time complexity of the filter operation? If it is O(n), my intuition is to convert the list into a HashSet-like data structure so that the lookup becomes O(1). Is there an elegant way to do this with streams?

zonyang
    It could be a parallel stream, but the complexity would still be O(n). Converting it to a set for one filter operation would still be O(n), so you'd need to use a set in the first place. You might want to use a `LinkedHashSet`, though, to keep the insertion order, or a `TreeSet` for other orderings (a list implies some ordering and/or allows duplicates). – Thomas Aug 14 '17 at 23:00
    `Stream#filter` traverses the `Stream` and applies the filter criterion to **each element** it encounters. An advantage is that streams can easily be parallelized, but that just reduces the time complexity by a constant factor (the number of your physical cores). – Zabuzard Aug 14 '17 at 23:36

3 Answers

14

It is O(n). The stream filtering uses iteration internally.

You could convert it to a map as follows:

Map<Integer, Listing> mapOfVinToListing = listings.stream()
                .collect(Collectors.toMap(Listing::getVin, Function.identity())); // assuming VIN is unique per listing
mapOfVinToListing.get(456); // O(1)

But, that conversion process is also O(n). So, if you only need to do this once, use the filter. If you need to query the same list many times, then converting it to a map may make sense.

You might also try using parallel streams. In some cases they may be more performant, but that depends a lot on the exact circumstances.
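
For reference, a minimal sketch of the parallel variant, assuming the same `listings` and `getVin()` from the question; `findAny` is the natural companion to a parallel stream because it doesn't force the encounter order to be respected:

// Still O(n) in the worst case, but the work is split across the common ForkJoinPool.
Optional<Listing> match = listings.parallelStream()
                .filter(l -> l.getVin() == 456)
                .findAny(); // findAny() avoids the ordering constraint of findFirst()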

Adam
    Or, if possible, use an indexing `HashMap` from the beginning and store the elements there in the first place. If you need to do such queries often, then indexing the listings is definitely the better solution, as @Adam said. – Zabuzard Aug 14 '17 at 23:39
    @Zabuza Good point. If it's possible to create the collection as a HashMap, that's certainly better than converting it. – Adam Aug 14 '17 at 23:45
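
Following up on these comments, a minimal sketch (assuming the `listing1`/`listing2` variables and an `int`-valued `getVin()` from the question) of indexing the listings by VIN up front instead of collecting them into a `List` first:

Map<Integer, Listing> listingsByVin = new HashMap<>();
listingsByVin.put(listing1.getVin(), listing1); // O(1) per insertion
listingsByVin.put(listing2.getVin(), listing2);

Listing match = listingsByVin.get(456); // O(1) lookup, no conversion pass needed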
5

The worst case is O(n), but since the Stream is lazy, the iteration stops as soon as a matching value is found. If you need constant-time lookup all the time, converting to a Map is a good idea, at the cost of additional space; if the list is huge, you should take that into account. In fact, if the list is small, the difference between a Map and a List will be barely noticeable, unless you're working on a time-critical system.
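
To illustrate the short-circuiting, here is a small sketch with made-up VIN values; `peek` shows how many elements are actually examined before `findFirst` stops:

List<Integer> vins = Arrays.asList(123, 456, 789, 999);
Optional<Integer> first = vins.stream()
                .peek(v -> System.out.println("checking " + v)) // prints 123 and 456 only
                .filter(v -> v == 456)
                .findFirst(); // short-circuits: 789 and 999 are never examined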

Abhijit Sarkar
3

filter itself, without a terminal operation, has zero overhead, since it does absolutely nothing. Streams are driven by the terminal operation only: no terminal operation, nothing gets executed.
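
A tiny sketch demonstrating that laziness; nothing is printed until the terminal operation runs:

Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .filter(i -> {
                    System.out.println("filtering " + i); // not executed yet
                    return i > 1;
                });

long count = pipeline.count(); // only now does filter run, printing all three lines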

When a terminal operation is present, filter has to iterate lazily over the elements of the source, potentially all of them. So the time complexity of filter depends on the source you stream from; in your case a List, so it would be O(n).

But that would be the worst case. As far as I can see, you can't predict the average case for filter in general, because it depends on the underlying source.

Eugene