0

I ran my profiler program for 1000 iterations to calculate the average time taken to insert into Empty ArrayList and average time to insert into Non-Empty ArrayList. I ran my profiler program for ArrayList<Item>, where Item class is

class Item{
  int id;
  String name;
}

And the profiler program is something like

main(){
  final int iterations = 1000;

  /**** Empty List Insert ****/
  int id = 0;
  long[] timerecords = new long[iterations];
  ArrayList<Item> list = new ArrayList<>();
  for (int i = 0; i < iterations; i++) {
      long stime = System.nanoTime(); //Start Time
      list.add(new Item(id, "A"));
      long elapsedTime = System.nanoTime() - stime; //Elapsed time
      timerecords[i] = elapsedTime; //Record Elapsed time

      //reset
      list.clear();
  }
  System.out.println("Empty List Insert Takes = " + getAverageTime(timerecords) + " nanoseconds");

  /**** Non-Empty List Insert ****/
  list = new ArrayList<>();
  timerecords = new long[iterations];
  //Insert some Items
  list.add(new Item(id++, "B")); list.add(new Item(id++, "R"));
  list.add(new Item(id++, "H")); list.add(new Item(id++, "C")); 

  for (int i = 0; i < iterations; i++) {    
        Item item = new Item(id, "A");
        long stime = System.nanoTime(); //Start Time
        list.add(item);
        long elapsedTime = System.nanoTime() - stime; //Elapsed time
        timerecords[i] = elapsedTime; //Record Elapsed time

        //reset
        list.remove(item);
  }
  System.out.println("Non-Empty List Insert Takes = " + getAverageTime(timerecords) + " nanoseconds");
}

where getAverageTime(long [] timerecords) method returns average time in double. After running this program several times, I observed insertion into empty ArrayList takes more time than insertion into non-empty ArrayList. One such run output is

Empty List Insert Takes = 1027.781 nanoseconds
Non-Empty List Insert Takes = 578.825 nanoseconds

Whats the reason behind this ?

user1612078
  • 555
  • 1
  • 7
  • 22
  • 1
    Why don't you run it in a debugger and step through the code for ArrayList in the opensource core java libraries. You will probably see that the array list expands when you add elements to an empty one to make more space and subsequent inserts are inserted into the expanding data structures until next time you exceed its capacity then it will expand again. We could look this up for you and tell you exactly but wont you learn more by looking yourself and running it in the debugger to learn more techniques? Post your own answer to encourage others to run core lib code in a debugger... – simbo1905 Mar 07 '15 at 08:26
  • @simbo1905 Well I didn't try debugging it. Because I just wanted to measure the performance of several modules in my project that use `ArrayList`. And so I wrote this profiler program. But yeah I should try debugging it. Thanks for suggestion. – user1612078 Mar 07 '15 at 08:34
  • You are trying to create new Item Object while adding it to ArrayList in Empty List's case. That should also be taking time, while in Non-Empty List's case you are not doing that. @user1612078 – Nielarshi Mar 07 '15 at 09:05
  • `list.add(new Item(id, "A"));` takes a lot less than 500 ns but I doubt that nanotime has enough resolution to measure it properly. Micro benchmarks are difficult to get right: you should use a specialised library such as jmh instead of doing it manually – assylias Mar 07 '15 at 09:16
  • @Nielarshi In both cases, I am adding new Item object. Please have look once. – user1612078 Mar 07 '15 at 09:31
  • 1
    ` long stime = System.nanoTime(); //Start Time list.add(`new Item(id, "A")`); long elapsedTime = System.nanoTime() - stime; //Elapsed time` AND `long stime = System.nanoTime(); //Start Time list.add(item); long elapsedTime = System.nanoTime() - stime; //Elapsed time`. iN 2ND CASE you have already created Item object before you start timer. – Nielarshi Mar 07 '15 at 09:33
  • @Nielarshi Don't u see `Item item = new Item(id, "A");` and then `list.add(item);` – user1612078 Mar 07 '15 at 09:36
  • What you are not getting is the point where u create stime variable. You are certainly adding new item to the list. But in 1st case, before u do System.nanoTime(); u are calling new operator on Item while adding (after you assign stime variable) while in 2nd case, you are doind it before u r doing System.nanoTime(); – Nielarshi Mar 07 '15 at 09:39
  • I have added it as an answer to this. – Nielarshi Mar 07 '15 at 09:42

4 Answers4

2

The difference comes because of the instance creation (new Item()).

  long stime = System.nanoTime(); //Start Time
  list.add(new Item(id, "A")); // at this point two things are happening
  // creation of instance & addition into ArrayList
  // both takes their own time.

  long elapsedTime = System.nanoTime() - stime; //Elapsed time

The point where u create stime variable is the point where the timer starts. You are certainly adding new item to the list. But in 1st case, before u do System.nanoTime(); u are calling new operator on Item while adding (after you assign stime variable) which is taking that extra time. While in 2nd case, you are doing it before u r doing System.nanoTime();

    Item item = new Item(id, "A");
    long stime = System.nanoTime(); //Start Time
    list.add(item);
    long elapsedTime = System.nanoTime() - stime; //Elapsed time

so, only adding of item object to the ArrayList time is captured.

Nielarshi
  • 1,126
  • 7
  • 11
  • 1
    This was the first thing I tested as it looked obvious. But no, it has no observable effect. Even after fixing this, the results are virtually the same. – Gergely Bacso Mar 07 '15 at 09:59
1

If you look into the implementation of ArrayList - an empty ArrayList starts with a an empty array to store the data - thats easy.

But when you first call the add() method, this array needs to be copied - hence memory needs to be allocated. The ensureCapacity() and grow() methods try to minimize the creation of Arrays.copyOf(), but as long as the internal Array size is still small, more and more often this array has to be increased, which might lead to the increased duration.

Here's ArrayList.grow(int) from 1.7:

  private void grow(int minCapacity) {
    int oldCapacity = this.elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if(newCapacity - minCapacity < 0) {
      newCapacity = minCapacity;
    }

    if(newCapacity - 2147483639 > 0) {
      newCapacity = hugeCapacity(minCapacity);
    }

    this.elementData = Arrays.copyOf(this.elementData, newCapacity);
  }

You didn't measure the time a creation of an non-empty ArrayLIst takes, did you?

Alexander
  • 2,925
  • 3
  • 33
  • 36
1

I was interested, so I did some debugging on it. The effect comes from two separate sources:

As @Alexander already pointed out, constantly increasing the capacity of the Arraylist size comprimises your test. To "fix" this, you can just create your ArrayLists with sufficient provision, like:

ArrayList<Item> list = new ArrayList<>(iterations * 2);

Even after doing this one can observe a difference of 100-300% in performance between your first and second test case. The reason for that is related to Java internals, namely ClassLoading and JIT. By doing a warmUp round before your first test, something like this:

    int rounds = 10000;
    long[] t = new long[rounds];
    ArrayList<Item> warmUpList = new ArrayList<>(rounds);
    for (int i = 0; i < rounds; i++) {
        warmUpList.add(new Item(0, "A"));
        long stime = System.nanoTime(); // Start Time
        long elapsedTime = System.nanoTime() - stime; // Elapsed time
        t[i] = elapsedTime; // Record Elapsed time
    }

You can play with the optimizer. By increasing the value of the "round" variable from 100 to 10.000 you can see how your own tests are getting better and better performance, and than stabilizing at an equivalent speed.

Gergely Bacso
  • 14,243
  • 2
  • 44
  • 64
  • So, I got interest too, and tried running this on different platforms. Got this **Empty List Insert Takes = 709.0 nanoseconds Non-Empty List Insert Takes = 1430.0 nanoseconds**, now what was asked in question is opposite of what I got as result when I ran it on web compiler. – Nielarshi Mar 07 '15 at 10:45
1

Mesuring each single insertion using System.nanoTime() can results in non reliable values. It is better to do the same with System.currentTimeMillis() and mesuring the whole loop.

In addition the insert time in an ArrayList can be influenced on how many grows of the internal array are performed. So inserting in an empty ArrayList with a small internal array can results in many grows. Inserting in a non empty ArrayList with a partially empty big internal array can result in no grows, but if grows happens it is slower than for an empty array because you have to copy bigger arrays.

Davide Lorenzo MARINO
  • 26,420
  • 4
  • 39
  • 56