2

I am trying to implement a generic method to compute the mean of any kind of sequence (for example: List, Array) which contains any kind of numeric values (Int, Float, Double...), like this:


 def mean[T <: Numeric[T]](data:Seq[T])(implicit number: Numeric[T]): T = {
      data.foldLeft(number.zero)(number.plus) / data.size
  }

However, the division operation cannot be resolved. That is because the Numeric type does not have this operation defined (from the ScalaDoc). I want to convert it to double before proceeding with the division, but the method toDouble(x:T) from Numeric type expects a param. I have seen there is a type member for the Numeric[T] called NumericOps that does implement the toDouble method without receiving any param. Could I call this method.. somehow?

Cristina HG
  • You may use `number.toDouble(data.foldLeft(number.zero)(number.plus))`, or `import number._` so that you can write `data.foldLeft(zero)(_ + _).toDouble`. However, that is the wrong approach, because you are changing the **precision** of the input. You have to choose either [**Integral**](https://www.scala-lang.org/api/current/scala/math/Integral.html) or [**Fractional**](https://www.scala-lang.org/api/current/scala/math/Fractional.html) instead of **Numeric**. Both provide division, each for a subset of the numeric types. – Luis Miguel Mejía Suárez Sep 22 '19 at 16:20
  • Check [Generic Numeric division](https://stackoverflow.com/questions/40351176/generic-numeric-division) – Vüsal Sep 22 '19 at 16:30
  • @LuisMiguelMejíaSuárez But how can I use the Fractional type without using `toDouble` too? I tried `def mean[T <: Fractional[T]](data:Seq[T])(implicit number: Fractional[T]): Double = { data.foldLeft(number.zero)(number.plus).div(data.size) }`, but it still needs `toDouble` to perform the division. – Cristina HG Sep 22 '19 at 16:33
  • Possible duplicate of [How to implement generic average function in scala?](https://stackoverflow.com/questions/10160280/how-to-implement-generic-average-function-in-scala) – Mario Galic Sep 23 '19 at 18:59
  • @CristinaHG Consider solution by [Kolmar](https://stackoverflow.com/a/58069625/5205022) which does not result in loss of precision mentioned by Luis. – Mario Galic Sep 23 '19 at 21:06

3 Answers

4

Here is an example using Fractional. It preserves the precision of the input numbers and traverses the data only once. Note, however, that it only works for types that have a "precise" division, such as Float, Double, and BigDecimal; it does not work for integral types like Int or Long.

def mean[T](data: Iterable[T])(implicit N: Fractional[T]): T = {
  import N._

  val remaining = data.iterator

  @annotation.tailrec
  def loop(sum: T, count: Int): T =
    if (remaining.hasNext)
      loop(sum + remaining.next(), count + 1)
    else if (count == 0)
      zero
    else
      sum / fromInt(count)

  loop(zero, 0)
}

This was tested on Scala 2.13.
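
As a rough usage sketch (not the answer's exact code), the same single-pass idea can be restated with one `foldLeft` over a `(sum, count)` pair. The object name `FractionalMeanDemo` is made up for illustration, and returning `zero` for an empty input simply mirrors the behaviour of the function above.

```scala
object FractionalMeanDemo {
  // Condensed variant of the single-pass mean: accumulate (sum, count) in one fold.
  // Assumption: an empty input yields `zero`, as in the tail-recursive version.
  def mean[T](data: Iterable[T])(implicit N: Fractional[T]): T = {
    import N._
    val (sum, count) = data.foldLeft((zero, 0)) {
      case ((s, c), x) => (s + x, c + 1)
    }
    if (count == 0) zero else sum / fromInt(count)
  }

  def main(args: Array[String]): Unit = {
    println(mean(List(1.0f, 2.0f, 3.0f, 4.0f)))      // Float in, Float out
    println(mean(Vector(1.0, 2.0, 3.0, 4.0)))        // Double in, Double out
    println(mean(Seq(BigDecimal(1), BigDecimal(2)))) // BigDecimal in, BigDecimal out
    // mean(List(1, 2, 3, 4)) would not compile: there is no Fractional[Int].
  }
}
```

The result type always matches the element type, which is what keeps the computation in the input's own precision.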

3

If Double precision is sufficient, try

def mean[T](data: Seq[T])(implicit number: Numeric[T]): Double = {
  import number._
  val sum = data.foldLeft(zero)(plus) 
  toDouble(sum) / data.size
}

mean(Seq(1,2,3,4)) // 2.5

or using Fractional (but it will not work for Ints)

def mean[T](data: Seq[T])(implicit number: Fractional[T]): T = {
  import number._
  val sum = data.foldLeft(zero)(plus)
  div(sum, fromInt(data.size))
}

mean(Seq(1.0,2,3,4)) // 2.5
mean(Seq(1,2,3,4))   // error: could not find implicit value for parameter number: Fractional[Int]
Mario Galic
  • You can just use `sum.toDouble`. Anyway, see my other comment on why this is a bad idea. – Luis Miguel Mejía Suárez Sep 22 '19 at 16:23
  • @LuisMiguelMejíaSuárez Does the loss of precision happen only if `BigDecimal` is passed in? – Mario Galic Sep 22 '19 at 17:25
  • Loss in the sense of going from more to less, yes. But I was referring to loss in the sense of using a different precision. Usually one thinks that moving from **float** to **double** would not be problematic. But since the return type has to be `T`, you have to downcast the result of the division to `T` again, which could lead to a different result than if the whole operation were done using **Floats**. For most code that should not matter, but I like to keep everything in the same precision format, just to be sure. – Luis Miguel Mejía Suárez Sep 22 '19 at 17:29
  • You're missing all the `CanBuildFrom` fun, this will only work with `Seq` :) – flavian Sep 22 '19 at 21:46
  • @flavian Well, since the collection is only being consumed, no `CanBuildFrom` is required at all. It can be made more generic by using **Iterable** in `2.13` _(like my answer)_ or a **TraversableOnce** in `2.12`. – Luis Miguel Mejía Suárez Sep 22 '19 at 22:43
  • @LuisMiguelMejíaSuárez I know, I know; I have been waiting for it for a while. – flavian Sep 23 '19 at 17:55
  • @flavian Sorry, I did not understand what you meant with your message. – Luis Miguel Mejía Suárez Sep 23 '19 at 17:59
  • @LuisMiguelMejíaSuárez The simplification of the collections lib that finally arrived in 2.13 – flavian Sep 23 '19 at 18:50
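
The precision point discussed in these comments can be illustrated with `BigDecimal`, where a detour through `Double` visibly drops digits. This is only a sketch: `PrecisionDemo`, `meanAsDouble`, and `meanInT` are hypothetical names, and both helpers assume a non-empty input.

```scala
object PrecisionDemo {
  // Always goes through Double, like the Numeric-based version in the answer.
  // Assumption: `data` is non-empty.
  def meanAsDouble[T](data: Seq[T])(implicit num: Numeric[T]): Double =
    num.toDouble(data.sum) / data.size

  // Stays in T for the whole computation, like the Fractional-based version.
  // Assumption: `data` is non-empty.
  def meanInT[T](data: Seq[T])(implicit frac: Fractional[T]): T =
    frac.div(data.sum, frac.fromInt(data.size))

  def main(args: Array[String]): Unit = {
    val xs = Seq(
      BigDecimal("1.00000000000000000001"),
      BigDecimal("1.00000000000000000003")
    )
    println(meanAsDouble(xs)) // digits beyond Double precision are lost
    println(meanInT(xs))      // computed entirely in BigDecimal
  }
}
```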
2

Unless you have to use Numeric, why not just use Fractional? It extends Numeric with a div operation.

You may find this of interest as they talk about the different options:

https://stackoverflow.com/a/40351867/67566
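
For integer-like element types, one of the other options is `Integral`, which offers truncating division through `quot`. The following is a minimal sketch under that assumption; `IntegralMeanDemo` and `integralMean` are hypothetical names, and the input is assumed to be non-empty (an empty `Seq[Int]` would cause a division by zero).

```scala
object IntegralMeanDemo {
  // Truncating mean for types such as Int, Long, or BigInt.
  // Integral provides `quot` (integer division) instead of Fractional's `div`.
  // Assumption: `data` is non-empty.
  def integralMean[T](data: Seq[T])(implicit I: Integral[T]): T =
    I.quot(data.sum, I.fromInt(data.size))

  def main(args: Array[String]): Unit = {
    println(integralMean(Seq(1, 2, 3, 4)))        // 10 quot 4, truncated
    println(integralMean(Seq(BigInt(7), BigInt(8))))
  }
}
```

Because the division truncates, this suits cases where an integer result is acceptable; otherwise the Double-returning or Fractional versions are the better fit.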

James Black