Summing "Large" Nat's

Question

Given:

scala> import shapeless.nat.
_0   _10   _12   _14   _16   _18   _2    _21   _3   _5   _7   _9      natOps            
_1   _11   _13   _15   _17   _19   _20   _22   _4   _6   _8   apply   toInt             

scala> import shapeless.ops.nat._
import shapeless.ops.nat._

After > 3 minutes, the following code has not compiled/run. Why's that?

scala> Sum[_22, _22]

Also, looking at the above REPL auto-completion, does _44 even exist in shapeless?

Looks suspiciously similar to [max size of tuples](http://stackoverflow.com/q/4152223/1240268) but can't see where they're defined in the code. — Andy Hayden, Feb 23 '17 at 03:39

Travis Brown · Accepted Answer · 2017-02-23T14:53:06.370

Why is it so slow?

Let's start with a smaller number. When you ask for Sum[_4, _4], the compiler is going to go looking for an instance, and it'll find these two methods:

implicit def sum1[B <: Nat]: Aux[_0, B, B] = new Sum[_0, B] { type Out = B }
implicit def sum2[A <: Nat, B <: Nat](implicit
  sum: Sum[A, Succ[B]]
): Aux[Succ[A], B, sum.Out] = new Sum[Succ[A], B] { type Out = sum.Out }

The first one is clearly out since _4 is not _0. It knows that _4 is the same as Succ[_3] (more on that in a second), so it'll try sum2 with A as _3 and B as _4.

This means we need to find a Sum[_3, _5] instance. sum1 is out for similar reasons as before, so we try sum2 again, this time with A = _2 and B = _5, which means we need a Sum[_2, _6], which gets us back to sum2, with A = _1 and B = _6, which sends us looking for a Sum[_1, _7]. This is the last time we'll use sum2, with A = _0 and B = _7. This time when we go looking for a Sum[_0, _8] we'll hit sum1 and we're done.

So it's clear that for n + n we're going to have to do n + 1 implicit searches, and during each one the compiler is going to be doing type equality checks and other stuff (update: see Miles's answer for an explanation of what the biggest problem here is) that requires traversing the structure of the Nat types, so we're in exponential land. The compiler really, really isn't designed to work efficiently with types like this, which means that even for small numbers, this operation is going to take a long time.

Side note 1: implementation in Shapeless

Off the top of my head I'm not entirely sure why sum2 isn't defined like this:

implicit def sum2[A <: Nat, B <: Nat](implicit
  sum: Sum[A, B]
): Aux[Succ[A], B, Succ[sum.Out]] = new Sum[Succ[A], B] { type Out = Succ[sum.Out] }

This is much faster, at least on my machine, where Sum[_18, _18] compiles in four seconds as opposed to seven minutes and counting.

Side note 2: induction heuristics

This doesn't seem to be a case where Typelevel Scala's -Yinduction-heuristics helps—I just tried compiling Shapeless with the @inductive annotation on Sum and it's still seems pretty much exactly as horribly slow as without it.

What about 44?

The _1, _2, _3 type aliases are defined in code produced by this boilerplate generator in Shapeless, which is configured only to produce values up to 22. In this case specifically, this is an entirely arbitrary limit. We can write the following, for example:

type _23 = Succ[_22]

And we've done exactly the same thing the code generator does, but going one step further.

It doesn't really matter much that Shapeless's _N aliases stop at 22, though, since they're just aliases. The important thing about a Nat is its structure, and that's independent of any nice names we might have for it. Even if Shapeless didn't provide any _N aliases at all, we could still write code like this:

import shapeless.Succ, shapeless.nat._0, shapeless.ops.nat.Sum

Sum[Succ[Succ[_0]], Succ[Succ[_0]]]

And it would be exactly the same as writing Sum[_2, _2], except that it's a lot more annoying to type.

So when you write Sum[_22, _22] the compiler isn't going to have any trouble representing the result type (i.e. 44 Succs around a _0), even though it doesn't have a _44 type alias.

Re: Side Note 1, is it worthwhile to make a PR against shapeless for re-defining `sum2`? — Kevin Meredith, Feb 24 '17 at 14:43

Miles Sabin · Answer 2 · 2017-02-24T14:45:36.330

Following on from Travis's excellent answer, it appears that it's the use of the member type in the definition of sum2 which is the root of the problem. With the following definition of Sum and its instances,

trait Sum[A <: Nat, B <: Nat] extends Serializable { type Out <: Nat }

object Sum {
  def apply[A <: Nat, B <: Nat](implicit sum: Sum[A, B]): Aux[A, B, sum.Out] = sum

  type Aux[A <: Nat, B <: Nat, C <: Nat] = Sum[A, B] { type Out = C }

  implicit def sum1[B <: Nat]: Aux[_0, B, B] = new Sum[_0, B] { type Out = B }
  implicit def sum2[A <: Nat, B <: Nat, C <: Nat]
    (implicit sum : Sum.Aux[A, Succ[B], C]): Aux[Succ[A], B, C] =
      new Sum[Succ[A], B] { type Out = C }
}

which replaces the use of the member type with an additional type variable, the compile time is 0+noise on my machine both with and without -Yinduction-heurisitics.

I think that the issue we're seeing is a pathological case for subtyping with member types.

Aside from that, the induction is so small that I wouldn't actually expect -Yinduction-heurisitics to make much of an improvement.

Update now fixed in shapeless.

Summing "Large" Nat's

2 Answers2

Why is it so slow?

Side note 1: implementation in Shapeless

Side note 2: induction heuristics

What about 44?