Hopefully this is not a duplicate.

Before anything else: I know ArrayList is not the best choice, but this is just curiosity.

I was wondering about the implementation of ArrayList. I looked into it and found that it uses an array for storage.

For an array, when you have:

int [] arr;

arr points to the first element of the array, and because the element type is int, the compiler knows how far to jump:

arr[2] => arr value + 2 * typeof(int) = address of arr[2]

Now, since an ArrayList is untyped, I was wondering how the compiler can figure out where the next item is. I would guess there is some overhead that describes each item's data so that the compiler can perform the pointer arithmetic.

As a result, an ArrayList should be much slower than any typed collection, since it cannot jump straight to an element without first knowing the sizes of the elements before it. That would make it quite similar to a LinkedList.

Everts
  • Why not look for yourself? There are plenty of decompilers that will decompile to C#. Reflector is one. – Oded Jan 09 '13 at 10:18
  • `typeof` is not the same as `sizeof` – leppie Jan 09 '13 at 10:19
  • Didn't get what jumping you are talking about? ArrayList uses array of objects. Getting second item is `_items[1]` – Sergey Berezovskiy Jan 09 '13 at 10:22
  • There is no magic: `object[] _items` is used as the item storage. Thus, each element's offset is always a multiple of `sizeof(item reference)`, where `item reference` is a reference to a class instance or a boxed value – DmitryG Jan 09 '13 at 10:23
  • @fafase - We expect people to make an effort _before_ asking a question. Finding out how something was implemented is simple enough using a decompiler. – Oded Jan 09 '13 at 11:02
  • @Oded - you are right, why is anyone asking questions here since they all could figure it out by themselves. leppie - yep, it should be sizeof, my bad... – Everts Jan 09 '13 at 11:02
  • As I said, I tried to look for it. I do not have ilasm on this computer so I just thought someone who knows would nicely take 2 sec to answer as it was done by Nicholas below. Now it is also possible for you to ignore questions. – Everts Jan 09 '13 at 11:07

3 Answers

An ArrayList contains only references to the objects, not the objects themselves. All references are the same size, so the problem doesn't exist.

Internally, every element is typed as object.

For arrays and generic collections of value types (for example int[] or List<int>), the actual values are stored inline in the array and the element size is used as you describe. If you put a value type in an ArrayList, it is boxed into an object and the reference to that object is stored in the ArrayList.
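
To make the boxing point concrete, here is a minimal sketch. The class and method names (ArrayListBoxingDemo, DescribeElements) are made up for illustration; only ArrayList and its indexer come from the framework. Every slot comes back as object, whatever was put in, and a value type has to be unboxed with a cast:

```csharp
using System;
using System.Collections;

// Hypothetical demo class, not part of the framework.
public static class ArrayListBoxingDemo
{
    // Returns the runtime type name of each element. The indexer always
    // hands back an object reference, so every slot has the same size.
    public static string[] DescribeElements(ArrayList list)
    {
        var names = new string[list.Count];
        for (int i = 0; i < list.Count; i++)
        {
            object item = list[i];          // static type is always object
            names[i] = item.GetType().Name; // runtime type of the (boxed) value
        }
        return names;
    }

    public static void Main()
    {
        var list = new ArrayList();
        list.Add(42);       // int is boxed; a reference to the box is stored
        list.Add("hello");  // already a reference type; the reference is stored
        list.Add(3.14);     // double is boxed as well

        Console.WriteLine(string.Join(", ", DescribeElements(list))); // Int32, String, Double

        int n = (int)list[0]; // explicit unboxing back to int
        Console.WriteLine(n + 1); // 43
    }
}
```

The sketch assumes the list holds no null entries, since GetType() would throw on one.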

Anders Abel

For arrays of structs (value types), the size of each element is known.

For arrays of reference types, the array stores the references (pointers) to the actual objects, which live on the heap.

The size of a pointer is known as well: 4 bytes on x86 and 8 bytes on x64.

So, the pointer arithmetic is always easy and fast.

In the case of ArrayList, the internal storage is an object[], so the implementation is not optimal for storing value types, as they will be boxed and stored on the heap too.
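
As a rough sketch of that arithmetic (conceptual only: the runtime does this for you, and real arrays also carry a header before the first element), the helper below is a hypothetical name, while sizeof(int) and IntPtr.Size are the real language and framework features:

```csharp
using System;

// Hypothetical demo class, not part of the framework.
public static class ElementOffsetDemo
{
    // Conceptual byte offset of element `index`, measured from the first
    // element, when every element occupies `elementSize` bytes.
    public static long OffsetOf(int index, int elementSize)
    {
        return (long)index * elementSize;
    }

    public static void Main()
    {
        int intSize = sizeof(int);  // always 4 for int
        int refSize = IntPtr.Size;  // 4 in a 32-bit process, 8 in a 64-bit one

        // int[]: the values sit inline, so the stride is sizeof(int).
        Console.WriteLine(OffsetOf(2, intSize)); // 8

        // object[] (what ArrayList uses): the stride is the reference size,
        // regardless of what the references point to.
        Console.WriteLine(OffsetOf(2, refSize));
    }
}
```

Either way the stride is a constant, which is why indexing an ArrayList is still O(1); the extra cost for value types comes from boxing and the extra indirection, not from finding the slot.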

Nick Butler

Okay, you asked what the implementation of ArrayList is. Here it is: arraylist.cs

Straight from Microsoft no less. This is the Roslyn implementation.

Alexander Ryan Baggett