8

I have a List container which can potentially have up to 100,000 items in it to start with. While the program is running, this list will slowly empty. Should I lower the capacity as I empty the list?

I have done some testing and execution time seems to be the same, but is there much overhead to lowering the capacity of a list? I can find plenty of information about increasing the capacity, but very little on lowering it.

Eddie
  • 690
  • 10
  • 27
  • 6
    Why do you want to do this? Is your application slow? – TJHeuvel Aug 23 '11 at 14:56
  • 2
    Don't solve problems you don't have. – H H Aug 23 '11 at 15:00
  • It generates large quantities of test data for a complex system. Currently takes about 5 minutes for 100,000 entries. I'm just hunting for little ways to speed it up. – Eddie Aug 23 '11 at 15:02
  • @Eddie: Simple iteration over 100k elements shouldn't take 5 minutes. Do you know where the time is being consumed? – recursive Aug 23 '11 at 15:13
  • @Eddie: If it takes 5 minutes for 100,000 entries, speeding up the processing of each entry should be your focus, the actual process of cycling through 100,000 entries in a list is trivial. – James Michael Hare Aug 23 '11 at 15:15
  • @recursive: hah, same thought :-) – James Michael Hare Aug 23 '11 at 15:15
  • Sorry I should have been more clear. This process is only a very small aspect of the entire program. This will contain a list of 100,000 unique id's which are assigned to test data as the program creates it. – Eddie Aug 23 '11 at 15:18
  • @Eddie: Just out of curiosity, are you pulling items out of the front of the list, from the back of the list, or somewhere in the middle? – StriplingWarrior Aug 23 '11 at 15:24

3 Answers

10

Unless you have a very low amount of memory, this is a micro-optimization.

Normally, there is no need to change the capacity of a List<>.

From the TrimExcess method documentation:

This method can be used to minimize a collection's memory overhead if no new elements will be added to the collection. The cost of reallocating and copying a large List<T> can be considerable, however, so the TrimExcess method does nothing if the list is at more than 90 percent of capacity. This avoids incurring a large reallocation cost for a relatively small gain.
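A quick sketch of that behaviour (the counts here are arbitrary, chosen only to illustrate the 90 percent rule):

```csharp
using System;
using System.Collections.Generic;

class TrimExcessDemo
{
    static void Main()
    {
        var list = new List<int>(100_000);          // pre-sized backing array
        for (int i = 0; i < 100_000; i++) list.Add(i);

        // Removing items does not shrink the backing array by itself.
        list.RemoveRange(10_000, 90_000);
        Console.WriteLine(list.Capacity);           // still 100000

        // The list is now well under 90% full, so TrimExcess reallocates
        // the backing array down to the current Count.
        list.TrimExcess();
        Console.WriteLine(list.Capacity);           // 10000
    }
}
```

Had the list still been at more than 90 percent of capacity, the `TrimExcess` call would have done nothing at all.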

Oded
  • 489,969
  • 99
  • 883
  • 1,009
  • So if I calculate when it drops below 50/60% capacity and use that method then? That would be effective? – Eddie Aug 23 '11 at 15:04
  • 3
    @Eddie - it might be. Or not. It depends - you will need to test and see. – Oded Aug 23 '11 at 15:06
  • @Eddie: if you are TIME constrained, reducing the list size will do little. Keep in mind that 100,000 items (assuming class?) will be probably less than 1 MB in internal array space. As you remove each object from the list, it will be eligible for collection once you are done with it, so only the list internal storage is an issue, and as long as they're primitives or reference types, this really isn't much of an issue. – James Michael Hare Aug 23 '11 at 15:18
  • @Eddie: Now, if you WERE truly memory constrained, you COULD consider a LinkedList which is fully dynamic size, but it is slower to iterate over... – James Michael Hare Aug 23 '11 at 15:18
3

Do the math: 100,000 items * 4 bytes per item = roughly 400 KB. If that's too much memory overhead for your program, you can call TrimExcess as Oded points out, or recreate a smaller list once the count drops. (I'm not sure that reducing the capacity will actually have the effect you're going for.)
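If you go the recreate route, one way is the `List<T>` copy constructor, which sizes the new backing array to the source's current count (the figures below are illustrative only):

```csharp
using System;
using System.Collections.Generic;

class RecreateDemo
{
    static void Main()
    {
        var ids = new List<int>(100_000);
        for (int i = 0; i < 100_000; i++) ids.Add(i);
        ids.RemoveRange(5_000, 95_000);

        // Copying into a fresh list allocates a backing array sized to
        // the source's Count; the old 100,000-slot array becomes garbage.
        ids = new List<int>(ids);
        Console.WriteLine(ids.Capacity);  // 5000
    }
}
```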

StriplingWarrior
  • 151,543
  • 27
  • 246
  • 315
2

Lowering the capacity of a list involves allocating a new backing array and copying the data across, so it's a relatively expensive operation.

In your particular case I would say it is not worth it unless you start hitting memory problems.

One strategy that could be employed, if this were to become a real problem, is to create a 'chunked' implementation of IList<> which uses not one array but several, each of a preconfigured size, with additional chunks (fixed-size arrays) added as the previous one fills up. This also allows the list to shrink relatively inexpensively by releasing unused chunks as items are removed, whilst limiting the memory overhead to at most one non-full chunk (the last).

This approach adds a performance overhead to every operation on the list, though, as the list has to calculate which chunk an item resides in and create new chunks as required. So it is not worth using unless you truly have a memory problem and a list that truly changes size dramatically over time.
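For illustration only, here is a bare-bones sketch of the chunked idea (not a full IList<> implementation; the chunk size of 4096 is an arbitrary choice):

```csharp
using System;
using System.Collections.Generic;

// Items live in fixed-size chunks; empty trailing chunks are
// released as the collection shrinks, so no large copy is needed.
class ChunkedList<T>
{
    private const int ChunkSize = 4096;
    private readonly List<T[]> chunks = new List<T[]>();
    public int Count { get; private set; }

    public void Add(T item)
    {
        if (Count == chunks.Count * ChunkSize)
            chunks.Add(new T[ChunkSize]);   // grow by one fixed-size chunk
        chunks[Count / ChunkSize][Count % ChunkSize] = item;
        Count++;
    }

    public T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)Count)
                throw new ArgumentOutOfRangeException(nameof(index));
            // Every access pays for locating the chunk first.
            return chunks[index / ChunkSize][index % ChunkSize];
        }
    }

    public void RemoveLast()
    {
        if (Count == 0) throw new InvalidOperationException("list is empty");
        Count--;
        // Once the last chunk is completely unused, release it,
        // returning its memory without touching the other chunks.
        if (chunks.Count * ChunkSize - Count >= ChunkSize)
            chunks.RemoveAt(chunks.Count - 1);
    }
}
```

Note how the indexer has to do a division and a modulo on every access, which is the per-operation cost mentioned above.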

Paul Ruane
  • 37,459
  • 12
  • 63
  • 82