
I have a sorted array (unique values, not duplicated).

I know I can use Array#bsearch, but it finds values rather than deleting them. Can I delete a value in O(log n) as well? How?

Let's say I have this array:

arr = [-3, 4, 7, 12, 15, 20] #very long array

And I would like to delete the value 7. So far I have this:

arr.delete(7) #I'm quite sure it's O(n)

Assuming Array#delete_at works in O(1), I could call arr.delete_at(value_index); then I just need the value's index. A binary search can find it, since the array is already sorted. But the only method I know of that exploits the sorted order is Array#bsearch, which returns the value itself; it does not delete anything or return an index.

To sum it up:

1) How can I delete a value from a sorted array without duplicates in O(log n)?

Or

2) Assuming Array#delete_at works in O(1) (does it?), how can I get the value's index in O(log n)? (The array is already sorted; must I implement the search myself?)

Thank you.

Simone Carletti
Roko
  • Bear in mind that using `delete_at` combined with `index` might be faster than implementing (in Ruby) your own `binarySearchReturningIndex` method. See [this answer](http://stackoverflow.com/a/7436419/3923525): `index` is implemented in C. – jmm Feb 13 '15 at 23:15
  • The C implementation vs. Ruby implementation aspect never even crossed my mind. Since Array#delete is also implemented in C, maybe I should just use that. Thanks! From now on I will also look for methods implemented in C for better performance :) – Roko Feb 13 '15 at 23:23

1 Answer


The standard Array implementation places no constraint on ordering or duplicates. Therefore, the default implementation has to trade performance for flexibility.

Array#delete deletes an element in O(n). Here's the C implementation. Notice the loop:

for (i1 = i2 = 0; i1 < RARRAY_LEN(ary); i1++) {
  ...
}

The cost is justified by the fact that Ruby has to scan the whole array for items matching the given value (note that delete removes all the entries matching a value, not just the first), then shift the following items to compact the array.
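A quick illustration of that behavior, using made-up sample values: every matching entry is removed in one call, which is why the whole array has to be visited.

```ruby
# Sample data chosen for illustration only.
arr = [1, 2, 2, 3, 2]

removed = arr.delete(2)   # removes every 2, returns the deleted value
# removed is 2, arr is now [1, 3]

missing = arr.delete(42)  # no match: returns nil, arr is unchanged
```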

delete_at has the same worst-case cost. It removes the element at the given index directly, but then it uses memmove to shift every following entry down by one index.
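To make the shifting cost visible, here is a rough pure-Ruby model of that step. The name `delete_at_by_hand` is hypothetical, the real work happens in C via memmove, and this sketch ignores negative indices, which the built-in method accepts.

```ruby
# Hypothetical model of delete_at: removing index i forces every later
# element to move one slot left, i.e. (n - i - 1) moves.
def delete_at_by_hand(arr, index)
  return nil if index < 0 || index >= arr.size

  removed = arr[index]
  # Shift each following element one position toward the front.
  (index...arr.size - 1).each { |i| arr[i] = arr[i + 1] }
  arr.pop # drop the now-duplicated last slot
  removed
end

sample = [-3, 4, 7, 12, 15, 20]
first = delete_at_by_hand(sample, 0) # worst case: five elements shift
```

Deleting the last element needs no shifting at all, so the O(n) bound is a worst case rather than the cost of every call.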

Using a binary search will not change the overall cost. The search costs O(log n), but you still need to delete the element at the index it finds. In the worst case, when the element is at position [0], shifting all the other items in memory by one position costs O(n).
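For completeness, a minimal sketch of the search-then-delete combination. It assumes Ruby 2.3 or later, where Array#bsearch_index is available (it did not exist when this question was asked); the helper name `delete_sorted` is hypothetical.

```ruby
# Hypothetical helper: delete `value` from a sorted array of unique values.
# Returns the deleted value, or nil when the value is not present.
def delete_sorted(arr, value)
  # Find-minimum mode: yields the first index whose element is >= value.
  idx = arr.bsearch_index { |x| x >= value } # O(log n)
  return nil if idx.nil? || arr[idx] != value

  arr.delete_at(idx) # still O(n) in the worst case: the tail is shifted
end

arr = [-3, 4, 7, 12, 15, 20]
deleted = delete_sorted(arr, 7)
absent = delete_sorted(arr, 8)
```

The lookup is logarithmic, but the total is still dominated by the shift inside delete_at, so this mainly saves comparisons rather than changing the complexity class.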

In all cases, the cost is O(n). This is not unexpected: Ruby's Array is backed by a plain contiguous array. As said before, there are no specific constraints that could be used to optimize operations; easy iteration and manipulation of the collection is the priority.

Array, sorted array, list and sorted list: all these data structures are flexible, but you pay the cost in some specific operations.

Back to your question: if you care about performance and your array is sorted and unique, you can definitely take advantage of that. If your primary goal is finding and deleting items, there are better data structures than an array. For instance, you can create a custom class that stores your data internally in a d-heap, where delete() costs O(log_d n) once the element's position is known; the same applies if you use a binomial heap.
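As one possible shape for such a custom class, here is a sketch of a binary min-heap (d = 2) over unique values. The class name `IndexedMinHeap` and the value-to-slot Hash are assumptions added for illustration: a heap alone cannot locate an arbitrary value in logarithmic time, so the Hash supplies the position.

```ruby
# Hypothetical class: binary min-heap of unique values plus a Hash that
# tracks each value's slot, so delete(value) runs in O(log n).
class IndexedMinHeap
  def initialize(values = [])
    @heap = []
    @pos = {} # value => index in @heap
    values.each { |v| push(v) }
  end

  def size
    @heap.size
  end

  def min
    @heap.first
  end

  def include?(value)
    @pos.key?(value)
  end

  # Insert a value; duplicates are ignored (values are assumed unique).
  def push(value)
    return self if @pos.key?(value)

    @heap << value
    @pos[value] = @heap.size - 1
    sift_up(@heap.size - 1)
    self
  end

  # Move the last element into the vacated slot, then restore heap order.
  def delete(value)
    i = @pos.delete(value)
    return nil if i.nil?

    last = @heap.pop
    unless i == @heap.size # the deleted value was not in the last slot
      @heap[i] = last
      @pos[last] = i
      sift_down(sift_up(i))
    end
    value
  end

  private

  def swap(i, j)
    @heap[i], @heap[j] = @heap[j], @heap[i]
    @pos[@heap[i]] = i
    @pos[@heap[j]] = j
  end

  # Moves the element at i toward the root; returns its final index.
  def sift_up(i)
    while i > 0
      parent = (i - 1) / 2
      break if @heap[parent] <= @heap[i]

      swap(i, parent)
      i = parent
    end
    i
  end

  # Moves the element at i toward the leaves until both children are larger.
  def sift_down(i)
    loop do
      left = 2 * i + 1
      right = left + 1
      smallest = i
      smallest = left if left < @heap.size && @heap[left] < @heap[smallest]
      smallest = right if right < @heap.size && @heap[right] < @heap[smallest]
      break if smallest == i

      swap(i, smallest)
      i = smallest
    end
  end
end

heap = IndexedMinHeap.new([-3, 4, 7, 12, 15, 20])
heap.delete(7) # logarithmic number of swaps, no tail shifting
```

The trade-off is that a heap gives up index access and in-order iteration, so this only pays off if deletion and minimum lookup are the operations that matter.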

Simone Carletti
  • If I use another data structure, wouldn't a binary search tree be a valid choice? – Roko Feb 13 '15 at 23:33
  • It depends on what you need to do and where you want performance. Keep in mind that a find and delete in a plain (unbalanced) BST still costs O(n) in the worst case, and O(log n) on average. It's a reasonable compromise, better than a simple scan, but the worst case (hence the general cost) is still O(n). – Simone Carletti Feb 13 '15 at 23:51