
I am looking for the best way to compute different types of ranking for a Scala collection such as IndexedSeq, like the different strategies in R (http://stat.ethz.ch/R-manual/R-devel/library/base/html/rank.html). I can't find this in the current API, but perhaps I am mistaken.

val tabToRank = IndexedSeq(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

For example, the "first" rank strategy, where the first occurrence wins a tie, would return:

tabToRank.rank("first")
// would return (4,1,6,2,7,11,3,10,8,5,9)

Here is my use case: I have a list of cities with their populations (vector data like tabToRank) at the final state of a simulation. I need to a) rank the cities and b) sort them by rank, in order to plot a "rank of city by population" graph, i.e. the well-known rank-size distribution:

[Figure: a rank-size distribution]

reyman64
  • Perhaps you could briefly explain the concept of ranking for people (like me) who are not familiar with it. I only know the rank of a matrix, not the rank of a vector of numbers. – Petr Oct 11 '12 at 13:30
  • I'm not sure I get it, I'm merely guessing: is the result supposed to be a permutation showing how to reorder the input sequence so that it is sorted? If so, how do the different strategies fit in? – Petr Oct 11 '12 at 14:58

2 Answers


For the city data, you want

citipop.sortBy(x => -x).zipWithIndex.map(_.swap)

to first sort the populations largest first (the default is smallest first, so we sort by the negated value), then number them (starting from 0), and finally swap each pair so that the rank comes first and the population second.
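As a minimal sketch of that one-liner in use, assuming a hypothetical population vector (the numbers below are invented for illustration) and adding 1 so that ranks start at 1 rather than 0:

```scala
object CityRank {
  // Hypothetical populations, made up for this example.
  val cityPop: IndexedSeq[Int] = IndexedSeq(120, 4500, 300, 980)

  // Sort largest first, then pair each population with its 1-based rank.
  def rankSize(pop: Seq[Int]): Seq[(Int, Int)] =
    pop.sortBy(x => -x).zipWithIndex.map { case (p, i) => (i + 1, p) }
}
```

Here `CityRank.rankSize(CityRank.cityPop)` gives the (rank, population) pairs (1,4500), (2,980), (3,300), (4,120), which is the data a rank-size plot needs.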

Scala doesn't have a built-in statistical library, however. In general, you'll have to know what you want to do and do it yourself or use a Java library (e.g. Apache Commons Math).
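As an illustration of doing it yourself, here is a rough sketch of 1-based ranking with four of the tie-handling methods named in R's `rank` documentation (`first`, `min`, `max`, `average`). The object and method names are hypothetical and not part of any library, and it assumes that values which compare equal under the ordering are also equal under `==`.

```scala
object TieRank {
  // Hypothetical helper: 1-based ranks, with ties handled by a method named as in R's `rank`.
  // Returns Double because "average" can yield fractional ranks.
  def rank[A](xs: Seq[A], ties: String = "first")(implicit ord: Ordering[A]): Seq[Double] = {
    val v = xs.toIndexedSeq
    // Original indices listed in sorted order; sortBy is stable, so equal values keep input order.
    val order = v.indices.sortBy(i => v(i))
    // "first" ranks: position in the sorted order, counted from 1.
    val first = new Array[Int](v.size)
    for ((origIdx, pos) <- order.zipWithIndex) first(origIdx) = pos + 1

    // Give every occurrence of a value the same rank, picked from that value's "first" ranks.
    // Assumption: values equal under `ord` are also equal under ==.
    def perValue(pick: Seq[Int] => Double): Seq[Double] = {
      val rankOf: Map[A, Double] =
        v.indices.groupBy(i => v(i)).map { case (value, idxs) => value -> pick(idxs.map(i => first(i))) }
      v.map(rankOf)
    }

    ties match {
      case "first"   => first.toIndexedSeq.map(_.toDouble)
      case "min"     => perValue(_.min.toDouble)
      case "max"     => perValue(_.max.toDouble)
      case "average" => perValue(rs => rs.sum.toDouble / rs.size)
      case other     => throw new IllegalArgumentException(s"unknown ties method: $other")
    }
  }
}
```

With the vector from the question, `TieRank.rank(tabToRank, "first")` yields 4, 1, 6, 2, 7, 11, 3, 10, 8, 5, 9, while `"average"` gives the two 1s a rank of 1.5 each and the three 5s a rank of 8.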

Rex Kerr

Here is a piece of code that does what you gave as an example:

object Rank extends App {
  val tabToRank = IndexedSeq(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

  def rank[A](input: Seq[A])(implicit ord: Ordering[A]): Seq[Int] = {
    // pair each value with its original index
    val withIndices: Seq[(A, Int)] = input.zipWithIndex
    // sort by the values (the sort is stable, so ties keep their original order)
    val sorted: Seq[(A, Int)] = withIndices.sortBy(_._1)
    // keep only the original indices
    val indices = sorted.map(_._2)
    // invert the permutation: the element originally at index i gets rank j
    // (a plain Array is used because mutable.ArraySeq cannot be instantiated
    // with `new` in Scala 2.13)
    val r = new Array[Int](indices.size)
    for ((i, j) <- indices.zipWithIndex)
      r(i) = j
    r.toIndexedSeq
  }

  println(rank(tabToRank))
}

It:

  • annotates the elements with their indices,
  • sorts them by value,
  • throws away the values, keeping just the indices,
  • and inverts the permutation to get the map you need.

(Note that it counts from 0 rather than from 1, as basically all programming languages do.)
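To compare against the 1-based result in the question, the 0-based ranks can simply be shifted by one. A minimal sketch of that, with the same idea condensed (the names `rank0` and `rank1` are made up for this example):

```scala
object RankFromOne {
  // 0-based ranks; ties are broken by first occurrence because sortBy is stable.
  def rank0[A](input: Seq[A])(implicit ord: Ordering[A]): Seq[Int] = {
    val r = new Array[Int](input.size)
    // After sorting (value, originalIndex) pairs, the position in the sorted order is the rank.
    for (((_, origIdx), pos) <- input.zipWithIndex.sortBy(_._1).zipWithIndex)
      r(origIdx) = pos
    r.toIndexedSeq
  }

  // Shift to 1-based ranks, as R reports them.
  def rank1[A](input: Seq[A])(implicit ord: Ordering[A]): Seq[Int] =
    rank0(input).map(_ + 1)
}
```

For the vector from the question, `rank1` gives 4, 1, 6, 2, 7, 11, 3, 10, 8, 5, 9, which matches the expected "first" result.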

I don't understand the other strategies well enough to include them here.

Petr