18

Since I liked programming in Scala, for my Google interview, I asked them to give me a Scala / functional programming style question. The Scala functional style question that I got was as follows:

You have two strings consisting of alphabetic characters as well as a special character representing the backspace symbol. Let's call this backspace character '/'. When you get to the keyboard, you type this sequence of characters, including the backspace/delete character. The solution you are to implement must check if the two sequences of characters produce the same output. For example, "abc", "aa/bc", "abb/c", "abcc/", "/abc", and "//abc" all produce the same output, "abc". Because this is a Scala / functional programming question, you must implement your solution in idiomatic Scala style.

I wrote the following code (it might not be exactly what I wrote, I'm just going off memory). Basically I just go linearly through the string, prepending characters to a list, and then I compare the lists.

def processString(string: String): List[Char] = {
  string.foldLeft(List[Char]()){ case(accumulator: List[Char], char: Char) =>
    accumulator match {
      case head :: tail => if(char != '/') { char :: head :: tail } else { tail }
      case emptyList => if(char != '/') { char :: emptyList } else { emptyList }
    }
  }
}

def solution(string1: String, string2: String): Boolean = {
  processString(string1) == processString(string2)
}

So far so good? He then asked for the time complexity and I responded linear time (because you have to process each character once) and linear space (because you have to copy each element into a list). Then he asked me to do it in linear time, but with constant space. I couldn't think of a way to do it that was purely functional. He said to try using a function in the Scala collections library like "zip" or "map" (I explicitly remember him saying the word "zip").

Here's the thing. I think that it's physically impossible to do it in constant space without having any mutable state or side effects. Like I think that he messed up the question. What do you think?

Can you solve it in linear time, but with constant space?

Jatin
Michael Lafayette
  • Since `immutability` is one of the cornerstones of functional programming, you are bound to create copies of the string/list/array, which means that doing any of this in constant space does not seem feasible. And as far as `zip` is concerned, any usage of it will mean linear space by definition. – sarveshseri Sep 09 '18 at 07:45
  • Regardless of what you do, you will need to construct the output somehow. The output has size O(n). The only way I can see how to recover O(1) space would be if you could drop parts of the input at the same time you construct parts of the output. But `String`s don't *have* parts to drop, so it seems impossible. It would be trivial, for example, if the input were a reverse list of characters. Other ideas, like re-using parts of the input string, have the same problem: strings don't have parts, so you would have to keep a list of slice indices, but that list takes O(n) space. – Jörg W Mittag Sep 09 '18 at 07:52
  • Even if we forget the String immutability and consider `Array[Char]`, we will need to mutate those arrays to do things in constant space. And mutability does not apply to functional programs. – sarveshseri Sep 09 '18 at 10:31
  • 1
    Meta-comment: can you really ask an interviewer to get a question about a particular topic? I find this surprising. – Dici Sep 09 '18 at 11:20
  • 1
    @SarveshKumarSingh if you use Scala views, you can zip two collections in `O(1)` **additional** space – Dici Sep 09 '18 at 13:29
  • 1
    @JörgWMittag The `solution` has to *compare* the two string descriptions, it does not have to normalize them. The output is a `Boolean`, which takes constant space. – Andrey Tyukin Sep 09 '18 at 14:26
  • @AndreyTyukin: Oh, how stupid! You are right, I was focusing too much on the OP's code instead of reading the problem statement. – Jörg W Mittag Sep 09 '18 at 15:09
  • It's a public forum. There is no grace in using the `f` word! – Jatin Sep 09 '18 at 15:56
  • @Dici Process of zipping can be done in `O(1)` space, but the new "zipped" collection is going to take linear space. – sarveshseri Sep 09 '18 at 18:00
  • @SarveshKumarSingh you don't necessarily have to materialize the collection. You could do a reduce, trying to find at least one element matching a predicate while iterating over all elements, none of which would require storing the result of the zipping in memory. That's what I meant. – Dici Sep 09 '18 at 18:15
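
The idea in the last comment, zipping without materializing anything, can be sketched with plain iterators. This is only a sketch under assumptions: the names `LazyCompare`, `survivors` and `sameOutput` are made up for illustration and do not come from the thread. Each string is scanned from right to left, `scanLeft` carries the count of pending backspaces, and `zipAll` pairs the surviving characters lazily, so only a constant number of values should be live at any time.

```scala
object LazyCompare {
  // Hypothetical helper: the characters that survive the backspaces,
  // produced lazily from right to left. The scan state is the number of
  // pending deletions plus the character (if any) that just survived.
  def survivors(s: String): Iterator[Char] =
    s.reverseIterator
      .scanLeft((0, Option.empty[Char])) { case ((pending, _), c) =>
        if (c == '/') (pending + 1, None)         // one more character to delete
        else if (pending > 0) (pending - 1, None) // this character is deleted
        else (0, Some(c))                         // this character survives
      }
      .collect { case (_, Some(c)) => c }

  // Pair the two lazy streams; zipAll pads the shorter one with None,
  // so a length mismatch shows up as Some(x) compared with None.
  def sameOutput(a: String, b: String): Boolean =
    survivors(a).map(c => Option(c))
      .zipAll(survivors(b).map(c => Option(c)), None, None)
      .forall { case (x, y) => x == y }
}
```

`forall` stops at the first mismatch, so neither string is normalized in full. Whether this counts as purely functional is debatable, since the iterators hide mutable state.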

4 Answers

6

This code takes O(N) time and needs only three integers of extra space:

def solution(a: String, b: String): Boolean = {

  def findNext(str: String, pos: Int): Int = {
    @annotation.tailrec
    def rec(pos: Int, backspaces: Int): Int = {
      if (pos == 0) -1
      else {
        val c = str(pos - 1)
        if (c == '/') rec(pos - 1, backspaces + 1)
        else if (backspaces > 0) rec(pos - 1, backspaces - 1)
        else pos - 1
      }
    }
    rec(pos, 0)
  }

  @annotation.tailrec 
  def rec(aPos: Int, bPos: Int): Boolean = {
    val ap = findNext(a, aPos)
    val bp = findNext(b, bPos)
    (ap < 0 && bp < 0) ||
    (ap >= 0 && bp >= 0 && (a(ap) == b(bp)) && rec(ap, bp))
  }

  rec(a.size, b.size)
}

The problem can be solved in linear time with constant extra space: if you scan from right to left, then you can be sure that the /-symbols to the left of the current position cannot influence the already processed symbols (to the right of the current position) in any way, so there is no need to store them. At every point, you need to know only two things:

  1. Where are you in the string?
  2. How many symbols do you have to throw away because of the backspaces?

That makes two integers for storing the positions, and one additional integer for temporarily storing the number of accumulated backspaces during the findNext invocation. That's a total of three integers of space overhead.
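
To see the bookkeeping on a concrete input, here is a small sketch that lifts `findNext` out of `solution` and lists the indices it lands on. The wrapper object `FindNextTrace` and the helper `survivingIndices` are hypothetical names added for illustration. The list is built only to display the trace; the comparison itself never needs it.

```scala
import scala.annotation.tailrec

object FindNextTrace {
  // Same logic as findNext above: index of the next surviving character
  // strictly to the left of `pos`, or -1 if there is none.
  @tailrec
  def findNext(str: String, pos: Int, backspaces: Int = 0): Int =
    if (pos <= 0) -1
    else if (str(pos - 1) == '/') findNext(str, pos - 1, backspaces + 1)
    else if (backspaces > 0) findNext(str, pos - 1, backspaces - 1)
    else pos - 1

  // Hypothetical helper: every index findNext visits, right to left.
  def survivingIndices(str: String): List[Int] =
    Iterator.iterate(findNext(str, str.length))(i => findNext(str, i))
      .takeWhile(_ >= 0)
      .toList
}
```

For "abb/c" this gives the indices 4, 1, 0 (the characters c, b, a): index 3 is the backspace and index 2 is the b it deletes.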

Intuition

Here is my attempt to formulate why the right-to-left scan gives you an O(1) algorithm:

The future cannot influence the past, therefore there is no need to remember the future.

The "natural time" in this problem flows from left to right. Therefore, if you scan from right to left, you are moving "from the future into the past", and therefore you don't need to remember the characters to the right of your current position.

Tests

Here is a randomized test, which makes me pretty sure that the solution is actually correct:

val rng = new util.Random(0)
def insertBackspaces(s: String): String = {
  val n = s.size
  val insPos = rng.nextInt(n)
  val (pref, suff) = s.splitAt(insPos)
  val c = ('a' + rng.nextInt(26)).toChar
  pref + c + "/" + suff
}

def prependBackspaces(s: String): String = {
  "/" * rng.nextInt(4) + s
}

def addBackspaces(s: String): String = {
  var res = s
  for (i <- 0 until 8) 
    res = insertBackspaces(res)
  prependBackspaces(res)
}

for (i <- 1 until 1000) {
  val s = "hello, world"
  val t = "another string"

  val s1 = addBackspaces(s)
  val s2 = addBackspaces(s)
  val t1 = addBackspaces(t)
  val t2 = addBackspaces(t)

  assert(solution(s1, s2))
  assert(solution(t1, t2))
  assert(!solution(s1, t1))
  assert(!solution(s1, t2))
  assert(!solution(s2, t1))
  assert(!solution(s2, t2))

  if (i % 100 == 0) {
    println(s"Examples:\n$s1\n$s2\n$t1\n$t2")
  }
}

A few examples that the test generates:

Examples:
/helly/t/oj/m/, wd/oi/g/x/rld
///e/helx/lc/rg//f/o, wosq//rld
/anotl/p/hhm//ere/t/ strih/nc/g
anotx/hb/er sw/p/tw/l/rip/j/ng
Examples:
//o/a/hellom/, i/wh/oe/q/b/rld
///hpj//est//ldb//y/lok/, world
///q/gd/h//anothi/k/eq/rk/ string
///ac/notherli// stri/ig//ina/n/g
Examples:
//hnn//ello, t/wl/oxnh///o/rld
//helfo//u/le/o, wna//ova//rld
//anolq/l//twl//her n/strinhx//g
/anol/tj/hq/er swi//trrq//d/ing
Examples:
//hy/epe//lx/lo, wr/v/t/orlc/d
f/hk/elv/jj//lz/o,wr// world
/anoto/ho/mfh///eg/r strinbm//g
///ap/b/notk/l/her sm/tq/w/rio/ng
Examples:
///hsm/y//eu/llof/n/, worlq/j/d
///gx//helf/i/lo, wt/g/orn/lq/d
///az/e/notm/hkh//er sm/tb/rio/ng
//b/aen//nother v/sthg/m//riv/ng

Seems to work just fine. So, I'd say that the Google-guy did not mess up, looks like a perfectly valid question.

Andrey Tyukin
  • 1
    Mmm, it is more concise indeed. I think pattern matching is easier on the eye though. your solution would also be a little faster since it doesn't have to create objects while iterating, but I wasn't focusing on speed too much, I could have removed the wrapper type at the expense of some readability (well from my point of view). – Dici Sep 09 '18 at 15:38
  • 1
    @Dici Removed half of the noise caused by unnecessary `aBsp` and `bBsp` variables: between the invocation of `findNext`, they are guaranteed to be zero anyway. Now it's even shorter, and needs only three integers overhead. – Andrey Tyukin Sep 09 '18 at 18:39
5

You don't have to create the output to find the answer. You can iterate the two sequences at the same time and stop on the first difference. If you find no difference and both sequences terminate at the same time, they're equal, otherwise they're different.

But now consider sequences such as this one: aaaa/// to compare with a. You need to consume 6 elements from the left sequence and one element from the right sequence before you can assert that they're equal. That means that you would need to keep at least 5 elements in memory until you can verify that they're all deleted. But what if you iterated over the elements from the end? You would then just need to count the number of backspaces and ignore as many elements as necessary in the left sequence, without having to keep them in memory, since you know they won't be present in the final output. You can achieve O(1) memory using these two tips.
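
As a worked example of the counting, the sketch below records the backspace counter after each character of a string, scanning from the end. `PendingTrace` and `pendingCounts` are hypothetical names, and the resulting list exists only to show the trace; the real algorithm keeps just the current counter value.

```scala
object PendingTrace {
  // Counter of pending deletions after reading each character,
  // scanning from the last character to the first.
  def pendingCounts(s: String): List[Int] =
    s.reverseIterator
      .scanLeft(0) { (pending, c) =>
        if (c == '/') pending + 1         // a backspace adds a pending deletion
        else if (pending > 0) pending - 1 // this character is deleted
        else 0                            // nothing pending: the character survives
      }
      .drop(1) // drop the initial counter value
      .toList
}
```

For aaaa/// the counter goes 1, 2, 3 on the three backspaces, then 2, 1, 0 as three a's are thrown away, and stays at 0 on the first a, which is the only character that survives.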

I tried it and it seems to work:

import scala.annotation.tailrec

def areEqual(s1: String, s2: String) = {
    def charAt(s: String, index: Int) = if (index < 0) '#' else s(index)

    @tailrec
    def recSol(i1: Int, backspaces1: Int, i2: Int, backspaces2: Int): Boolean = (charAt(s1, i1), charAt(s2, i2)) match {
        case ('/',  _) => recSol(i1 - 1, backspaces1 + 1, i2, backspaces2)
        case (_,  '/') => recSol(i1, backspaces1, i2 - 1, backspaces2 + 1)
        case ('#' , '#') => true
        case (ch1, ch2)  => 
            if      (backspaces1 > 0) recSol(i1 - 1, backspaces1 - 1, i2    , backspaces2    )
            else if (backspaces2 > 0) recSol(i1    , backspaces1    , i2 - 1, backspaces2 - 1)
            else        ch1 == ch2 && recSol(i1 - 1, backspaces1    , i2 - 1, backspaces2    )
    }
    recSol(s1.length - 1, 0, s2.length - 1, 0)
}

Some tests (all pass, let me know if you have more edge cases in mind):

// examples from the question
val inputs = Array("abc", "aa/bc", "abb/c", "abcc/", "/abc", "//abc")
for (i <- 0 until inputs.length; j <- 0 until inputs.length) {
    assert(areEqual(inputs(i), inputs(j)))
}

// more deletions than required
assert(areEqual("a///////b/c/d/e/b/b", "b")) 
assert(areEqual("aa/a/a//a//a///b", "b"))
assert(areEqual("a/aa///a/b", "b"))

// not enough deletions
assert(!areEqual("aa/a/a//a//ab", "b")) 

// too many deletions
assert(!areEqual("a", "a/"))

PS: just a few notes on the code itself:

  • Scala type inference is good enough so that you can drop types in the partial function inside your foldLeft
  • Nil is the idiomatic way to refer to the empty list case
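
Applied to the `processString` from the question, the two notes above might look like the following sketch. The wrapping object `Idiomatic` is only there for illustration; the behaviour is meant to be the same as the original, including the fact that the resulting list is in reverse typing order.

```scala
object Idiomatic {
  def processString(string: String): List[Char] =
    string.foldLeft(List.empty[Char]) {
      case (_ :: tail, '/') => tail        // backspace removes the last typed character
      case (Nil, '/')       => Nil         // backspace on empty output does nothing
      case (acc, char)      => char :: acc // ordinary character: prepend it
    }

  def solution(string1: String, string2: String): Boolean =
    processString(string1) == processString(string2)
}
```

This is still linear in space, so it does not answer the interviewer's follow-up; it only tidies the first attempt.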

Bonus:

I had something like Tim's solution in mind before implementing my idea, but I started early with pattern matching on characters only and it didn't fit well because some cases require the number of backspaces. In the end, I think a neater way to write it is a mix of pattern matching and if conditions. Below is my longer original solution; the one I gave above was refactored later:

import scala.annotation.tailrec

def areEqual(s1: String, s2: String) = {
    @tailrec
    def recSol(c1: Cursor, c2: Cursor): Boolean = (c1.char, c2.char) match {
        case ('/',  '/') => recSol(c1.next, c2.next)
        case ('/' ,   _) => recSol(c1.next, c2     )
        case (_   , '/') => recSol(c1     , c2.next)
        case ('#' , '#') => true
        case (a   ,   b) if (a == b) => recSol(c1.next, c2.next)
        case _           => false
    }
    recSol(Cursor(s1, s1.length - 1), Cursor(s2, s2.length - 1))
}

private case class Cursor(s: String, index: Int) {
    val char = if (index < 0) '#' else s(index)
    def next = {
      @tailrec
      def recSol(index: Int, backspaces: Int): Cursor = {
          if      (index < 0      ) Cursor(s, index)
          else if (s(index) == '/') recSol(index - 1, backspaces + 1)
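          // The threshold is 1, not 0: the final branch itself steps past one
          // character, which accounts for the last pending backspace.
          else if (backspaces  > 1) recSol(index - 1, backspaces - 1)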
          else                      Cursor(s, index - 1)
      }
      recSol(index, 0)
    }
}
Dici
  • What's the `#` thing good for? The problem description didn't say anything about `#`. – Andrey Tyukin Sep 09 '18 at 12:57
  • It's a lame way I have found to represent the end of the sequence. It helped me write everything in terms of character comparisons and not worry about out of bound exceptions. An `Option` might also do the trick, but it would be less concise. I used this character because the problem statement mentions that no other special character than the backspace will be present in the input sequences. – Dici Sep 09 '18 at 13:01
  • 2
    @AndreyTyukin: This trick is called a "sentinel". It is a way of removing exceptional logic from loops by turning the problem of "detecting the end of a sequence" into a problem of "processing an element". When you already have a case distinction for multiple different elements, it is more elegant to add another element than to add a completely different kind of condition. – Jörg W Mittag Sep 09 '18 at 15:12
  • @JörgWMittag didn't know this had a name :D I learnt something here – Dici Sep 09 '18 at 15:16
  • 1
    @JörgWMittag Yeah, I see that it was supposed to be a sentinel value, but why choose `'#'` for that? At least, one could take something like `\uFFFF`, or some other reserved symbol that is guaranteed to not occur in any piece of valid unicode text. And the `case ('#' , '#') => true` ends up in the code anyway. I'd argue that checking that `aPos < 0 && bPos <0` is both shorter and more robust (does not behave weirdly on input with hashtags). – Andrey Tyukin Sep 09 '18 at 18:42
  • 2
    @AndreyTyukin literally the first character I've seen on my keyboard that wasn't a letter or a slash. Good enough since the problem specifies that only alphanumeric characters and a single special character can be in the sequences. – Dici Sep 09 '18 at 18:44
  • A single `if/else` chain is clearer than a mix of `match` and `if/else`, and `Option` is more idiomatic Scala than a special termination character. But I am flattered that you decided to re-write your solution after I posted mine. – Tim Sep 11 '18 at 09:01
  • @Tim I'm fine with mixing pattern matching and if conditions. The fact I tried not to mix them is what made my original solution longer, because it's not very concise to match things like "a positive number". I would generally agree about `Option` but I found it adds boilerplate in this case. I would need lots of `Some(...)` and only one `None` – Dici Sep 11 '18 at 11:22
4

If the goal is minimal memory footprint, it's hard to argue against iterators.

def areSame(a :String, b :String) :Boolean = {
  def getNext(ci :Iterator[Char], ignore :Int = 0) : Option[Char] =
    if (ci.hasNext) {
      val c = ci.next()
      if (c == '/')        getNext(ci, ignore+1)
      else if (ignore > 0) getNext(ci, ignore-1)
      else                 Some(c)
    } else None

  val ari = a.reverseIterator
  val bri = b.reverseIterator
  1 to a.length.max(b.length) forall(_ => getNext(ari) == getNext(bri))
}

On the other hand, when arguing FP principles it's hard to defend iterators, since they're all about maintaining state.
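
A tiny illustration of that state, as a sketch with made-up names (`IteratorState`, `first`, `second`): the same expression `it.next()` yields a different value each time it is evaluated, which is exactly what referential transparency rules out.

```scala
object IteratorState {
  val it: Iterator[Char] = "abc".reverseIterator

  val first: Char = it.next()  // reads the last character of the string
  val second: Char = it.next() // same expression, but the iterator has moved on

  val hasMore: Boolean = it.hasNext // one character is still unread
}
```

The recursion-over-indices solutions avoid this by passing the position explicitly as an argument instead of hiding it inside an object.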

jwvh
2

Here is a version with a single recursive function and no additional classes or libraries. This is linear time and constant memory.

import scala.annotation.tailrec

def compare(a: String, b: String): Boolean = {
  @tailrec
  def loop(aIndex: Int, aDeletes: Int, bIndex: Int, bDeletes: Int): Boolean = {
    val aVal = if (aIndex < 0) None else Some(a(aIndex))
    val bVal = if (bIndex < 0) None else Some(b(bIndex))

    if (aVal.contains('/')) {
      loop(aIndex - 1, aDeletes + 1, bIndex, bDeletes)
    } else if (aDeletes > 0) {
      loop(aIndex - 1, aDeletes - 1, bIndex, bDeletes)
    } else if (bVal.contains('/')) {
      loop(aIndex, 0, bIndex - 1, bDeletes + 1)
    } else if (bDeletes > 0) {
      loop(aIndex, 0, bIndex - 1, bDeletes - 1)
    } else {
      aVal == bVal && (aVal.isEmpty || loop(aIndex - 1, 0, bIndex - 1, 0))
    }
  }

  loop(a.length - 1, 0, b.length - 1, 0)
}
Tim