
If I have a generic parameter that I am resolving via pattern matching to a primitive such as Int, is auto-boxing cheaper than using a custom wrapper type? E.g.

def test[A](x: A): Int = x match {
  case i: Int => i
  case _ => -1
}

versus

case class NumChannels(value: Int)

def test[A](x: A): Int = x match {
  case n: NumChannels => n.value
  case _ => -1
}

Does the first approach offer any performance benefit? Is the situation the same if the method takes Any instead:

def test(x: Any): Int = ...

?

  • What were your profiling results and how were they surprising? –  Jul 17 '14 at 14:21
  • I was not profiling. I'm trying to decide which kind of API to settle on. – 0__ Jul 17 '14 at 14:46
  • Why do you not profile and make your decision based on the results, rather than relying on speculations? –  Jul 17 '14 at 14:59
  • No speculations—I want to know technically what the difference between the two versions is (if there is any). That's a question for people familiar with how autoboxing works in Scala. – 0__ Jul 17 '14 at 15:48

1 Answer


If you look at the output of javap (showing only the parts that differ):

  • The version using Int:

      10: invokestatic  #17                 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
      13: istore_3
      14: iload_3

  • The version using NumChannels:

      10: checkcast     #12                 // class app/benchmark/scala/benchmark3b/NumChannels
      13: astore_3
      14: aload_3
      15: invokevirtual #16                 // Method app/benchmark/scala/benchmark3b/NumChannels.value:()I

One could assume from this that the first version should be faster. The third version, which takes Any, compiles to the same bytecode as the first.
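
For reference, the Any-based variant referred to above would look something like the sketch below; its body was elided in the question, so this simply assumes the same match as the generic version:

def test(x: Any): Int = x match {
  case i: Int => i  // same instanceof check plus BoxesRunTime.unboxToInt as the generic version
  case _ => -1
}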

Yet a micro benchmark using JMH shows no real difference:

Benchmark                             Mode   Samples         Mean   Mean error    Units 
a.b.s.benchmark3a.Benchmark3a.run    thrpt         5       42,352        0,480   ops/ms 
a.b.s.benchmark3b.Benchmark3b.run    thrpt         5       42,793        1,439   ops/ms 

Using Oracle JDK 1.8, Scala 2.10.3, Linux 32-Bit.


1st benchmark:

@State(Scope.Benchmark)
object BenchmarkState {
  final val n = 10000

  val input = 
    Array.range(0, n).map {
      n =>
        if (n % 2 == 0) {
          n
        } else {
          "" + n
        }
    }
}

class Benchmark3a {
  def test[A](x: A): Int = x match {
    case i: Int => i
    case _ => -1
  }

  @GenerateMicroBenchmark
  def run() = {
    var sum = 0
    var i = 0
    while (i < BenchmarkState.n) {
      sum += test(BenchmarkState.input(i))
      i +=1
    }
    sum
  }
}

2nd benchmark:

case class NumChannels(value: Int)

@State(Scope.Benchmark)
object BenchmarkState {
  final val n = 10000

  val input = 
    Array.range(0, n).map {
      n =>
        if (n % 2 == 0) {
          NumChannels(n)
        } else {
          "" + n
        }
    }
}

class Benchmark3b {
  def test[A](x: A): Int = x match {
    case n: NumChannels => n.value
    case _ => -1
  }

  @GenerateMicroBenchmark
  def run() = {
    var sum = 0
    var i = 0
    while (i < BenchmarkState.n) {
      sum += test(BenchmarkState.input(i))
      i +=1
    }
    sum
  }
}

In previous versions of the benchmark I used a Seq with map and sum; both variants performed equally well in that setup too, but they only reached around 4 ops/ms.
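
The exact code of those earlier runs is not part of this answer, but it was roughly along the lines of the following sketch (runWithSeq is a hypothetical name; it would sit in the same benchmark class and reuse test and BenchmarkState from above):

  // Hypothetical reconstruction of the earlier, slower variant:
  // map test over the input and sum the results, instead of using a manual while loop.
  @GenerateMicroBenchmark
  def runWithSeq() = BenchmarkState.input.toSeq.map(x => test(x)).sum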

Even using an Array and a while loop does not reveal a real difference.

So I would say that this (isolated) API design decision won't affect performance.


  • Thanks for the extensive test; which benchmark framework have you used here? – 0__ Jul 17 '14 at 18:48
  • JMH - fairly well hidden in the middle of the answer :-). I have added a link. As far as I know the sbt-jmh plugin currently only covers Scala-based benchmarks (for Scala/Java comparisons I use the Maven-based approach). – Beryllium Jul 17 '14 at 19:05