A few days ago, I was given the following interview question. It was described with Standard ML code, but I was free to answer in the language of my choice (I picked Python):
I have a type:

    datatype t = Leaf of int | Node of (t * t)

and a function, f, with the signature

    val f: int -> t

You need to write a function equals that checks whether two trees are equal. f is O(n), and it does "the worst possible thing" for the time complexity of your equals function. Write equals such that it is never exponential on n, the argument to f.
The example of f that was provided was:

    fun f n =
      if n = 0 then
        Leaf(0)
      else
        let
          val subtree = f (n - 1)
        in
          Node (subtree, subtree)
        end

which produces an exponentially large tree in O(n) time, so equals(f(n), f(n)) takes O(2^n) time with a naive equals implementation that is linear in the number of nodes of the tree.
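To make the blow-up concrete, here is a rough Python port of that f (my own translation, not the interviewer's code). The helpers logical_size and distinct_objects are names I made up for illustration: the first counts nodes the way a naive traversal sees them, the second counts the objects that were actually allocated.

```python
class Leaf:
    def __init__(self, value):
        self.value = value


class Node:
    def __init__(self, left, right):
        self.left = left
        self.right = right


def f(n):
    """Python port of the example f: both children are the same object."""
    if n == 0:
        return Leaf(0)
    subtree = f(n - 1)
    return Node(subtree, subtree)


def logical_size(tree):
    """Nodes seen by a naive traversal; shared subtrees are counted every time."""
    if isinstance(tree, Leaf):
        return 1
    return 1 + logical_size(tree.left) + logical_size(tree.right)


def distinct_objects(tree):
    """Objects actually allocated, found by tracking id() of visited nodes."""
    seen = set()
    stack = [tree]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        if isinstance(node, Node):
            stack.append(node.left)
            stack.append(node.right)
    return len(seen)


tree = f(10)
print(logical_size(tree))      # 2047, i.e. 2**11 - 1
print(distinct_objects(tree))  # 11, one object per level
```

So the tree is really a small DAG with heavy sharing, and anything that walks it as if it were a plain tree pays for every shared subtree again.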
I produced something like this:
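
    class Node:
        def __init__(self, left, right):
            self.left = left
            self.right = right

    class Leaf:
        def __init__(self, value):
            self.value = value

    def equals(left, right):
        # Identical objects are trivially equal; this is the only shortcut.
        if left is right:
            return True
        # Leaf case: a Node has no .value, which raises AttributeError.
        try:
            return left.value == right.value
        except AttributeError:
            pass
        # Node case: a Leaf has no .left/.right, so a Leaf/Node mix is unequal.
        try:
            return equals(left.left, right.left) and equals(left.right, right.right)
        except AttributeError:
            return False
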
which worked on the example of f that the interviewer provided, but failed in the general case of "f does the worst thing possible." He provided an example that I don't remember that broke my first attempt. I flubbed around for a bit and eventually made something that looked like this:
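
    # Keys are pairs of objects; Node and Leaf define no __eq__, so they hash by identity.
    cache = {}

    def equals(left, right):
        # Reuse the result if this exact pair of objects was compared before.
        try:
            return cache[(left, right)]
        except KeyError:
            pass
        result = False
        # Leaf case: a Node has no .value, which raises AttributeError.
        try:
            result = left.value == right.value
        except AttributeError:
            pass
        # Node case: a Leaf has no .left/.right, which raises AttributeError.
        try:
            left_result = equals(left.left, right.left)
            right_result = equals(left.right, right.right)
            cache[(left.left, right.left)] = left_result
            cache[(left.right, right.right)] = right_result
            result = left_result and right_result
        except AttributeError:
            pass
        cache[(left, right)] = result
        return result
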
but I felt like that was an awkward hack and it clearly wasn't what the interviewer was looking for. I suspect that there's an elegant way to avoid recomputing subtrees -- what is it?
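For reference, here is a tidier sketch of the same caching idea. I'm not claiming it is the elegant answer the interviewer wanted. It assumes the Leaf and Node classes from my first attempt, keeps the cache local to each call, and keys it on pairs of id() values. The leaf_value parameter on f is something I added only so I can build a tree that differs. Since f runs in O(n), it can allocate at most O(n) objects, so there are at most O(n^2) distinct pairs to compare.

```python
class Leaf:
    def __init__(self, value):
        self.value = value


class Node:
    def __init__(self, left, right):
        self.left = left
        self.right = right


def f(n, leaf_value=0):
    """Port of the example f; leaf_value is a hypothetical extra for testing."""
    if n == 0:
        return Leaf(leaf_value)
    subtree = f(n - 1, leaf_value)
    return Node(subtree, subtree)


def equals(left, right):
    """Structural equality that compares each pair of objects at most once."""
    cache = {}  # (id(a), id(b)) -> bool, valid while both trees stay alive

    def go(a, b):
        if a is b:
            return True
        key = (id(a), id(b))
        if key in cache:
            return cache[key]
        if isinstance(a, Leaf) and isinstance(b, Leaf):
            result = a.value == b.value
        elif isinstance(a, Node) and isinstance(b, Node):
            result = go(a.left, b.left) and go(a.right, b.right)
        else:
            result = False  # a Leaf never equals a Node
        cache[key] = result
        return result

    return go(left, right)


# Two separately built trees share no objects, so "is" alone never fires,
# yet each level's pair is compared once and then served from the cache.
print(equals(f(200), f(200)))                # True
print(equals(f(200), f(200, leaf_value=1)))  # False
```

The recursion depth still grows with the height of the tree, so a very deep input would need an explicit stack instead of recursive calls.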