
I have a list that I divided into a fixed number of sections (some of which may be empty).
Every section contains unordered elements, but the sections themselves are ordered backwards.
I reference the beginning of each section through a fixed-size array, whose elements are the indexes at which each section starts in the list.
I regularly extract the whole section at the tail of the list; when I do so, I set its index in the array to 0 (so the section will regrow from the head of the list) and then circularly increment the lastSection variable that I use to keep track of which section is at the tail of the list.
With the same frequency I also need to insert new elements back into the list, spread across one or more sections.
I chose a single sectioned list (instead of a list of lists or something like that) because, even though individual sections vary a lot in size (from empty to a few thousand elements), the total number of elements varies little during the application's runtime, AND because I also frequently need to get all the elements in the list and didn't want to concatenate multiple lists to produce that result.
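The structure described above could be sketched roughly like this (all names here — SectionedList, sectionStart, items — are illustrative, not from my actual code):

```csharp
using System.Collections.Generic;

// Rough sketch of the described structure: one backing list, a fixed-size
// array of section start indexes, and a variable tracking the tail section.
class SectionedList<T>
{
    private readonly List<T> items = new List<T>();
    private readonly int[] sectionStart;   // sectionStart[s] = index where section s begins
    private int lastSection;               // which section is currently at the tail

    public SectionedList(int sectionCount)
    {
        sectionStart = new int[sectionCount];
        lastSection = 0;
    }

    // Extract the whole tail section, reset its start index to 0 so it
    // regrows from the head, and circularly advance lastSection.
    public List<T> ExtractTailSection()
    {
        int start = sectionStart[lastSection];
        List<T> tail = items.GetRange(start, items.Count - start);
        items.RemoveRange(start, items.Count - start);
        sectionStart[lastSection] = 0;
        lastSection = (lastSection + 1) % sectionStart.Length;
        return tail;
    }

    // Getting all the elements is just the backing list itself —
    // no concatenation needed, which is the point of the single list.
    public List<T> All => items;
}
```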

Graphical representation of the data structure

Existential question:
Up to here, have I made any mistakes in the choice of the data structure, given that the operations described above are all the operations I perform on it?

Going forward:
The problem I am trying to address — this is the core of the application I am building, and I want to squeeze out every bit of performance I can, since it should run on smartphones — is: how can I perform those multiple inserts as fast as possible?

Trivial solution:
For each new group of elements belonging to a certain section, just do an InsertRange(sectionBeginning, groupOfElements).
Performance footprint:
Every InsertRange forces the list to shift all the content after the beginning of that section to the right, so with multiple calls some data will be shifted up to M times, where M is the number of InsertRange calls done with index != list.Count.
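In code, the trivial approach might look like this (names are illustrative). Inserting at the highest index first keeps the remaining section starts valid while the batch is being applied:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trivial approach: one InsertRange per group. Each call shifts every
// element after the insertion point to the right, so elements near the
// tail can end up being moved once per call — up to M times in total.
static void InsertPerGroup<T>(List<T> list, int[] sectionStart,
                              List<(int section, T[] items)> groups)
{
    // Process the highest insertion index first so the earlier
    // section starts remain valid without extra bookkeeping.
    foreach (var g in groups.OrderByDescending(x => sectionStart[x.section]))
    {
        list.InsertRange(sectionStart[g.section], g.items);
        // In real code the start indexes of all sections located after
        // this insertion point would also need to be bumped by g.items.Length.
    }
}
```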

Little smarter solution:
Since I know, before every multiple-insert step, which and how many new elements each section needs, I can append empty slots to the back of the list, perform M shifts of predetermined size, and then copy the new elements into the corresponding "holes" left inside the list.
I could extend the List class and implement a new InsertRange(int[] indexes, IEnumerable<T>[] collections) where each index points to the beginning of a section. However, I am worried that the List class has internal optimizations — such as an Array.Copy on its backing array, to which I don't think I have access — that would make my hand-written for-loop shifts perform worse. Is there a way to do a performant list shift so I can implement this and gain an advantage over multiple standard InsertRange calls?
Note: indexes and collections should be ordered by section.
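One way to get the "each element moves at most once" behaviour without touching List's internals is to build the result in a single pass, using List&lt;T&gt;.CopyTo (which delegates to Array.Copy internally) for the untouched runs. A sketch, assuming indexes is ascending and collections[i] is the group to insert at indexes[i]:

```csharp
using System;
using System.Collections.Generic;

// Single-pass multi-insert: every pre-existing element is copied exactly
// once into a new array, instead of being shifted up to M times.
static List<T> InsertRangeMulti<T>(List<T> list, int[] indexes, T[][] collections)
{
    int total = list.Count;
    foreach (var c in collections) total += c.Length;

    var result = new T[total];
    int src = 0, dst = 0;
    for (int i = 0; i < indexes.Length; i++)
    {
        int runLength = indexes[i] - src;
        list.CopyTo(src, result, dst, runLength);   // untouched run before this hole
        dst += runLength;
        src = indexes[i];
        Array.Copy(collections[i], 0, result, dst, collections[i].Length); // fill the hole
        dst += collections[i].Length;
    }
    list.CopyTo(src, result, dst, list.Count - src); // remaining tail
    return new List<T>(result);
}
```

This version allocates a new array per batch; since the total element count is described as nearly constant, that buffer could be allocated once and reused across batches to avoid GC pressure.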

Graphical representation of the multiple-at once insertRange approach

Another similar thread about insertRange:
Replace multiple InsertRange() into efficient way

Another similar thread about shifts in lists:
Does code exist, for shifting List elements to left or right by specified amount, in C#?

  • Why not have a List of Lists and just have each inner list a section rather than throwing them all together and manually managing their locations? – Broots Waymb Apr 26 '17 at 13:37
  • Also, what exactly are you after here? Are you simply asking how to make inserts faster? What leads you to believe you need to improve what you have? How much data are you working with? Unless it's a fairly large set, inserting should be nearly instant. You should simplify your question down to a single thing you're after and be a little more direct. – Broots Waymb Apr 26 '17 at 13:41
  • "Why not have a List of Lists" I talked about it a little in the post. Advantage of the list-of-lists approach: simply add to the end. Disadvantages: sections can go from thousands of elements to none, but since Lists don't shrink their capacity once it has grown, I would be stuck with every list occupying a lot of memory even when not in use, while the single sectioned list keeps the memory footprint contained; also, when I need a list containing all the elements, I would need to concat them all. – Tommaso Bonvicini Apr 26 '17 at 13:54
  • Essentially I want to avoid using SortedList: while it might works well as a solution for my problem (since every section is time-based and I could simply order by time), that would be really overkill since I don't need ordering, but just an approximation of time periods. The thing is: how to do shifts in lists the more performant way possible – Tommaso Bonvicini Apr 26 '17 at 13:56
  • Have you actually run this on a smart phone to see if what you currently have is a performance problem? Unless you're doing this hundreds or thousands of times per second, a simple list of lists is going to be plenty fast enough. And you can avoid the capacity problem by calling [TrimExcess](https://msdn.microsoft.com/en-us/library/ms132207(v=vs.110).aspx) from time to time. Or just set `Capacity = Count` if you think memory is really tight. – Jim Mischel Apr 26 '17 at 17:56
  • The idea would be to have a thousand of these data structures running in parallel with a total of 50 million elements flowing from one into the others. I'm going to test this soon, but I would have liked to be already on the right track. – Tommaso Bonvicini Apr 27 '17 at 07:00
  • There's no particular reason your multiple shifts idea won't work, and it could be faster than using the standard `InsertRange`. The expensive part will be re-allocating the list if it needs to grow. I'd still suggest that you get something up and running, *working*, using the standard `InsertRange` before you spend too much time thinking about optimization. – Jim Mischel Apr 27 '17 at 14:41

0 Answers