2

I'm new to trying to use the IEnumerable interfaces. I've always just written custom hash sorting rather than trying to use native syntax because I was somewhat confused by the implementation. I'm trying to determine if I can assemble a List in sorted format by using BinarySearch or some similar function. Is there a function that will return the nearest possible index for insertion of a new item to a List so the list always remains sorted via a hash tree every time you insert an item?

When I use BinarySearch it seems to always return -1 if there's no match. I would rather it return the nearest possible index. Is there a way to do this with the native IEnumerable interfaces? I'd rather not call "Sort(IComparer)" every time I want to reference the List.

In short: Can BinarySearch or some equivalent function be used when adding a new item to a List to find the best index to pass to "Insert(index, item)"?

ThisHandleNotInUse

2 Answers

4

Consider using SortedSet<T>, which has built-in support for keeping the list sorted. This has the advantage of guaranteeing that the contents will always be in sorted order, rather than relying on your code to properly respect the sort order whenever the list is modified.
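As a rough illustration of this suggestion, here is a minimal sketch of `SortedSet<T>` keeping its contents ordered as items are added. The class name `SortedSetDemo` and the helper `Build` are hypothetical names made up for this example. Note that, being a set, it silently rejects duplicate values and does not offer access by integer index, which may or may not suit your case.

```csharp
using System;
using System.Collections.Generic;

public static class SortedSetDemo
{
    // Hypothetical helper: builds a SortedSet from values in any order.
    public static SortedSet<int> Build(IEnumerable<int> items)
    {
        return new SortedSet<int>(items);
    }

    public static void Main()
    {
        SortedSet<int> set = Build(new[] { 5, 1, 4, 2, 3 });

        // Add returns false when the value is already present.
        bool addedAgain = set.Add(3);

        // Enumeration always yields the elements in sorted order.
        Console.WriteLine(string.Join(", ", set)); // 1, 2, 3, 4, 5
        Console.WriteLine(addedAgain);             // False
        Console.WriteLine(set.Min + " " + set.Max); // 1 5
    }
}
```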

Dan Bryant
  • Thank you, I was only aware of "SortedList" which wasn't what I wanted. I knew there had to be a class somewhere that did this. – ThisHandleNotInUse Jan 13 '15 at 23:06
  • Hmmm... MSDN says "SortedSet" uses a linear search pattern... doesn't this defeat the purpose of using the IComparer? Or is there a reason it must be linear that evades my reasoning regarding the IComparer interface? Right now, all the "IComparer" like classes feel very "black box" to me - I imagine you know what I mean... – ThisHandleNotInUse Jan 13 '15 at 23:12
  • IComparer just returns a single comparison result for a pair of values; it doesn't provide any other information, like hashing. Hashing is provided by the [Object.GetHashCode](http://msdn.microsoft.com/en-us/library/system.object.gethashcode%28v=vs.110%29.aspx) method, at least when the deriving class implements it. This is how classes like `Dictionary` generate their hashes – Dan Bryant Jan 13 '15 at 23:27
  • I just answered my own question (blush) by looking more carefully at the documentation for "BinarySearch" - apparently it returns the _bitwise complement_ of the next available position - the complement of -1 is "0", so what I thought was a "-1" representing "not found" is actually a -1 representing "insert at 0." Sorry for wasting your time... Your suggestion was good, though, but I'm going to answer my own question. SortedSet just looked too slow to me. – ThisHandleNotInUse Jan 13 '15 at 23:31
  • @ThisHandleNotInUse, where do you see the reference that insertion is linear? I opened up the code in dotPeek and it looks like SortedSet is implemented using a red-black tree behind the scenes, which has `O(log N)` insertion. – Dan Bryant Jan 13 '15 at 23:37
  • The insertion appears to be that... but according to this, "Search" on SortedSet is O(n) - unless it means searching it for something other than what it is ordered by: http://www.c-sharpcorner.com/UploadFile/0f68f2/comparative-analysis-of-list-hashset-and-sortedset/ – ThisHandleNotInUse Jan 13 '15 at 23:49
  • Sorry, I'm not a computer scientist, so sometimes I struggle to communicate with the precision I want due to uncertain grasp of some underlying concepts. – ThisHandleNotInUse Jan 13 '15 at 23:51
  • I just discovered that you can't enumerate over a SortedSet via index which is something I need to be able to do as well as have it sorted for this application - however, I will keep it in mind for future uses as it seems an interesting class... – ThisHandleNotInUse Jan 14 '15 at 00:02
  • I checked the code and the `Contains` call (which is the closest I can think of for the purposes of searching) is `O(log N)`, as it uses a binary search. You're correct that indexing by integer is inefficient, though. If you're really worried about performance, I recommend doing some testing with a good code profiler. – Dan Bryant Jan 14 '15 at 00:09
  • What I'm doing is constructing a class that utilizes a number of "blocks" of byte arrays and treats them like they're one complete array, allowing easy buffering of edits to different blocks. I need to be able to add and remove from the ends of the "List of Arrays" because it acts like a buffer, and to enumerate through them easily in some situations, which you can't do with SortedSet. I will come back and try to examine the nature of a "SortedSet" more carefully at some point when my head is a little clearer than it is now. – ThisHandleNotInUse Jan 14 '15 at 00:18
  • I'm not really "worried about performance" so much as I'd rather just grasp the most efficient way to do what I want to do in the first place... I can always use a "duct tape" method of programming, but I prefer to be precise at all times. – ThisHandleNotInUse Jan 14 '15 at 00:19
  • I see the problem now... Insert is incredibly inefficient so I might have to investigate SortedSet more... – ThisHandleNotInUse Jan 14 '15 at 02:07
0

In case anyone comes across this - the first solution is to always carefully read the documentation before asking questions.

BinarySearch(IComparer) returns the bitwise complement of the index of the next larger element (or of the list's Count if there is no larger element) when no match is found, so it can be used to keep a List sorted while populating it. When I saw it return "-1" I falsely concluded that meant "not found" because I frequently use "-1" for "not found" when the integer should otherwise be non-negative. In fact, -1 is the complement of 0, meaning "insert at index 0".
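For anyone who wants to see it spelled out, a minimal sketch of the approach might look like the following. `SortedListDemo` and `InsertSorted` are hypothetical names made up for this example; only `List<T>.BinarySearch` and `List<T>.Insert` are the framework's own. Passing `null` as the comparer falls back to the default comparer for `T`.

```csharp
using System;
using System.Collections.Generic;

public static class SortedListDemo
{
    // Hypothetical helper: inserts item where it keeps the list sorted
    // and returns the index used. The list must already be sorted.
    public static int InsertSorted<T>(List<T> list, T item, IComparer<T> comparer = null)
    {
        int index = list.BinarySearch(item, comparer);
        if (index < 0)
        {
            // Not found: the result is the bitwise complement of the
            // index of the next larger element (or of list.Count).
            index = ~index;
        }
        list.Insert(index, item);
        return index;
    }

    public static void Main()
    {
        var numbers = new List<int>();
        foreach (int n in new[] { 5, 1, 4, 2, 3 })
        {
            InsertSorted(numbers, n);
        }
        Console.WriteLine(string.Join(", ", numbers)); // 1, 2, 3, 4, 5
    }
}
```

The search itself is O(log n), but `List<T>.Insert` still has to shift every later element, so each insertion is O(n) overall. That is probably fine for small lists; for large ones it is worth measuring.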

ThisHandleNotInUse