111

I just had a rather unpleasant experience in our production environment: it kept failing with OutOfMemoryError: Java heap space.

I traced the issue to my use of ArrayList::new in a function.

To verify that this actually performs worse than creating the list with an explicit lambda (t -> new ArrayList<>()), I wrote the following small test program:

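import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TestMain {
  public static void main(String[] args) {
    // true: constructor reference; false: explicit lambda
    boolean newMethod = false;
    Map<Integer,List<Integer>> map = new HashMap<>();
    int index = 0;

    // Runs until killed or until the heap is exhausted
    while(true){
      if (newMethod) {
        map.computeIfAbsent(index, ArrayList::new).add(index);
      } else {
        map.computeIfAbsent(index, i->new ArrayList<>()).add(index);
      }
      if (index++ % 100 == 0) {
        System.out.println("Reached index "+index);
      }
    }
  }
}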

Running the program with newMethod = true fails with an OutOfMemoryError just after index reaches 30k. With newMethod = false, the program does not fail but keeps going until killed (index easily reaches 1.5 million).

Why does ArrayList::new create so many Object[] elements on the heap that it causes OutOfMemoryError so fast?

(By the way - it also happens when the collection type is HashSet.)

Peter Mortensen
Anders K

2 Answers

99

In the first case (ArrayList::new) you are using the constructor that takes an initial capacity argument; in the second case you are not. A large initial capacity (index in your code) causes a large Object[] to be allocated, resulting in your OutOfMemoryErrors.

Here are the two constructors' current implementations:

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

Something similar happens in HashSet, except the array is not allocated until add is called.
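
To get a feel for how quickly this adds up in the ArrayList case, here is a rough back-of-the-envelope sketch. It assumes 4 bytes per array slot (compressed references, which is an assumption about the JVM configuration) and ignores array headers and everything else on the heap; the class and method names are made up for illustration.

```java
public class CapacityEstimate {
    // Total number of array slots requested when each key 0..n-1
    // is passed to ArrayList(int initialCapacity): 0 + 1 + ... + (n-1).
    static long totalSlots(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        long slots = totalSlots(30_000);
        // Assumption: 4 bytes per reference; headers and other objects ignored.
        long bytes = slots * 4;
        System.out.println(slots + " slots, roughly " + bytes / 1_000_000 + " MB");
    }
}
```

Under those assumptions, the arrays requested by index 30,000 add up to roughly 1.8 GB, which is in the same range as the failure point reported in the question.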

Alex - GlassEditor.com
  • 2
    @Durandal: The `Map.computeIfAbsent` method sends the value of the `index` variable to the constructor in the first case, creating a large backing array. In the second case the zero-argument constructor is used, which does not create a backing array at all. – Lii Feb 09 '16 at 16:32
  • 9
    Ah, so the index actually becomes the initial size. That clears things up, thanks. – Durandal Feb 09 '16 at 16:34
  • 4
    @resueman: no, the default constructor of the Java 8 version doesn’t create a backing array until the first element is added. See http://stackoverflow.com/a/34250231/2711488 – Holger Feb 09 '16 at 17:54
  • 1
    Thanks for the answer - it simply had not occurred to me that it would take the argument passed to the Function and conclude that, since it is an integer, it might as well be used as the constructor argument. I thought that a constructor reference would always use the no-argument constructor. Thanks for the education! – Anders K Feb 09 '16 at 19:58
81

The computeIfAbsent signature is the following:

V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction)

So mappingFunction is a function that receives one argument. In your case K = Integer and V = List<Integer>, so the signature becomes (omitting PECS):

Function<Integer, List<Integer>> mappingFunction

When you write ArrayList::new in a place where a Function<Integer, List<Integer>> is required, the compiler looks for a suitable one-argument constructor, which is:

public ArrayList(int initialCapacity)

So essentially your code is equivalent to

map.computeIfAbsent(index, i->new ArrayList<>(i)).add(index);

Your keys are therefore treated as initialCapacity values, which leads to pre-allocation of arrays of ever-increasing size and, of course, quickly leads to an OutOfMemoryError.
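
The constructor selection can be made visible with a small sketch. Probe below is a hypothetical stand-in for ArrayList (it is not a JDK class); it only records which of its two constructors ran. The same expression, Probe::new, resolves to a different constructor depending on the functional interface it is assigned to.

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class ConstructorRefDemo {
    // Hypothetical stand-in for ArrayList: remembers which constructor was used.
    static final class Probe {
        final String constructorUsed;

        Probe() {
            constructorUsed = "no-arg";
        }

        Probe(int initialCapacity) {
            constructorUsed = "int(" + initialCapacity + ")";
        }
    }

    // Function<Integer, Probe> supplies one argument, so Probe(int) is chosen.
    static String viaFunction(int key) {
        Function<Integer, Probe> f = Probe::new;
        return f.apply(key).constructorUsed;
    }

    // Supplier<Probe> supplies no arguments, so Probe() is chosen.
    static String viaSupplier() {
        Supplier<Probe> s = Probe::new;
        return s.get().constructorUsed;
    }

    public static void main(String[] args) {
        System.out.println(viaFunction(30000)); // prints "int(30000)"
        System.out.println(viaSupplier());      // prints "no-arg"
    }
}
```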

In this particular case a constructor reference is not suitable. Use a lambda instead. If computeIfAbsent took a Supplier<? extends V>, then ArrayList::new would be appropriate.
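
As a minimal sketch of the difference, the example below keeps ArrayList::new but binds it to a Supplier, where it can only resolve to the no-argument constructor, and then wraps it in a lambda that ignores the key. The class name, the fill method, and the entry count are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class ComputeIfAbsentFix {
    // Builds a map with keys 0..count-1, each mapped to a one-element list.
    static Map<Integer, List<Integer>> fill(int count) {
        // As a Supplier, ArrayList::new refers to the no-argument constructor.
        Supplier<List<Integer>> newList = ArrayList::new;
        Map<Integer, List<Integer>> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            // The lambda ignores the key, so it is never used as a capacity.
            map.computeIfAbsent(i, key -> newList.get()).add(i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> map = fill(50_000);
        System.out.println("Entries: " + map.size());
    }
}
```

Writing key -> new ArrayList<>() directly, as in the question's second branch, has the same effect; the Supplier is only there to show where the constructor reference is safe.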

Tagir Valeev