I have written this algorithm here and I am trying to evaluate its time and space complexity in terms of Big-O notation. The algorithm determines if two given strings are anagrams.

def anagram(str1, str2)
  str1.each_char do |char|
    selected_index = str2.index(char)
    return false unless selected_index # char not found in str2 (index returned nil)

    str2.slice!(selected_index) # remove the matched character (mutates str2)
  end

  str2.empty?
end

I believe the time complexity of this function is O(n^2) and the space complexity is O(1). However, I may be mistaken about the space complexity (it could be O(n)), because the selected_index variable is repeatedly re-assigned, which might take up memory proportional to how many times the each_char loop runs.

If someone could please offer some guidance, that would be great :)

Henry
  • Does `str2.index` search through `str2`? If so, I think you're looking at time `O(n^2)` here... In fact, `slice` is almost certainly `O(n)` itself, so yes, this is time `O(n^2)`. – joanis Aug 21 '19 at 20:49
  • My apologies, it is quadratic as you mentioned... – Henry Aug 21 '19 at 20:50
  • Space is at least `O(n)`, since you're modifying a copy of `str2` with the call to `slice`. – joanis Aug 21 '19 at 20:51
  • I can think of an `O(n log n)` time algorithm for this, also with `O(n)` space. Do you care about making it faster, or just about doing the Big-O analysis of this code? – joanis Aug 21 '19 at 20:53
  • I have made it faster, using a hash table data structure into which I insert the counts of each letter within a loop. That gives me O(n) time and O(1) space. – Henry Aug 21 '19 at 20:55
  • Yeah, the hash table is even faster than what I thought (which was basically equality of sorting the two strings by character), but I disagree with `O(1)` space: the hash table will require `O(n)` space. – joanis Aug 21 '19 at 20:58
  • Does the re-assignment of `selected_index` contribute significantly to the space complexity? Or will that mainly come from the `.slice` call on the object? – Henry Aug 21 '19 at 20:59
  • I thought the hash table was `O(n)` at first glance after implementing it, but regardless of the input size there will always be a maximum of 26 characters. So it is constant. – Henry Aug 21 '19 at 21:01
  • I would have thought `selected_index =` would take `O(1)` space per call, I don't see it being a problem. – joanis Aug 21 '19 at 21:01
  • Agreed, if you consider a limited alphabet, it's constant space. – joanis Aug 21 '19 at 21:01
  • Many thanks for your help! – Henry Aug 21 '19 at 21:09
  • It doesn't make sense to talk about algorithmic complexity without specifying your machine model and cost model. What machine model are you using? What are the operations you are counting? – Jörg W Mittag Aug 22 '19 at 09:47

1 Answer

Gathering up all those comments into an answer, here is my analysis:

Time

The algorithm as presented does indeed have O(n^2) running time.

The body of the loop is executed n times, once per character of str1. Each iteration takes linear time for index, linear time for slice!, and constant time for the rest, for a total of O(n^2) time.

Space

The algorithm as presented requires linear space, because it updates a copy of str2 at each iteration.

The rest of the algorithm only takes constant space, unless you include the storage for the inputs themselves, which is also linear.
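
One caveat worth noting: in Ruby, slice! mutates its receiver, so the code as written consumes the string object the caller passed in. A minimal sketch of a non-destructive variant, assuming the inputs should be preserved, makes the linear-size working copy explicit. The name anagram_preserving and the variable remaining are hypothetical, chosen only for this illustration.

```ruby
# Hypothetical non-destructive variant of the original method.
# It works on a private copy of str2, so the caller's string is untouched.
def anagram_preserving(str1, str2)
  remaining = str2.dup              # O(n) extra space for the working copy
  str1.each_char do |char|
    idx = remaining.index(char)     # O(n) scan for the character
    return false unless idx         # char has no unmatched partner left

    remaining.slice!(idx)           # O(n) removal of the matched character
  end
  remaining.empty?                  # leftovers mean str2 had extra characters
end
```

The running time is unchanged at O(n^2); only the ownership of the O(n) working storage differs.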

Faster algorithm: sort str1 and str2

A faster algorithm would be to sort the characters of each string and compare the results, i.e. test whether sort-by-character(str1) equals sort-by-character(str2). That would take O(n log n) time and O(n) space for the sort, plus linear time and constant space for the comparison, for an overall O(n log n) time and O(n) space.
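
The sorting approach described above could be sketched in Ruby roughly as follows. The method name anagram_sorted is hypothetical, and the early length check is an added shortcut rather than part of the analysis.

```ruby
# Hypothetical sketch of the sort-and-compare approach.
def anagram_sorted(str1, str2)
  # Strings of different lengths can never be anagrams; skip the sort.
  return false unless str1.length == str2.length

  # chars builds an array of characters (O(n) space); sort is O(n log n);
  # comparing the two arrays is O(n).
  str1.chars.sort == str2.chars.sort
end
```

Unlike the original method, this version does not modify either input string.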

Even faster algorithm: use a hash (proposed by OP in the comments)

Using hash tables to store character counts and then comparing those counts can reduce the running time to O(n), assuming standard O(1) insert and lookup hash operations. The space in this case is the space required for the hash tables, which is O(k) for a character alphabet of size k and can be considered constant if k is fixed. Of course, the input parameters still consume their initial O(n) space as they are passed in or where they are originally stored; the O(k) reflects only the additional space required to run this algorithm.
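
The OP's actual hash-based code was not posted, so the following is only one possible sketch of the counting idea. It assumes a single hash is enough: counts are incremented for str1 and decremented for str2. The method name anagram_counts is hypothetical.

```ruby
# Hypothetical sketch of the character-counting approach.
def anagram_counts(str1, str2)
  # With equal lengths, never dropping below zero implies all counts end at zero.
  return false unless str1.length == str2.length

  counts = Hash.new(0)                       # at most k entries for an alphabet of size k
  str1.each_char { |c| counts[c] += 1 }      # O(n): tally characters of str1

  str2.each_char do |c|                      # O(n): consume the tally with str2
    return false if counts[c].zero?          # str2 has more of c than str1
    counts[c] -= 1
  end
  true
end
```

Each string is scanned once, so the time is O(n) under the usual assumption of O(1) hash operations, and the extra space is bounded by the number of distinct characters.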

joanis