22

This code is from a Scala Worksheet:

case class E(a: Int, b: String)

val l = List(
    E(1, "One"),
    E(1, "Another One"),
    E(2, "Two"),
    E(2, "Another Two"),
    E(3, "Three")
)

l.groupBy(x => x.a)                             
// res11: scala.collection.immutable.Map[Int,List[com.dci.ScratchPatch.E]] =
//    Map(
//      2 -> List(E(2,Two), E(2,Another Two)),
//      1 -> List(E(1,One), E(1,Another One)),
//      3 -> List(E(3,Three))
//    )

You will notice that groupBy returns a map, but the ordering of the elements is now different from what it was before. Any idea why this happens, and what is the best way to avoid it?

fresskoma
Jack
  • 1
    Hi @JacobusR. Your question is well written compared to many other questions on SO, but I hope you'd consider putting more effort into the code formatting next time. If you ask me, it was almost completely unreadable, and I'm generally more inclined to answer a question if I don't have to spend a minute "parsing" the code first :) – fresskoma Jan 21 '13 at 08:53
  • Hi @x3ro, a million apologies! It completely slipped my mind. I was busy writing the question and someone walked into my office, so I just posted it. – Jack Jan 21 '13 at 09:20
  • 1
    Although the resulting groups will have elements in the same order as the original list. Is that guaranteed? In other words could we have 2 -> List( E(2, Another Two), E(2, Two)) – Ustaman Sangat May 21 '15 at 22:14

2 Answers

21

Unless you specifically use a subtype of SortedMap, a map (like a set) is always in an unspecified order. Since "groupBy" doesn't return a SortedMap but only a general immutable.Map and also doesn't use the CanBuildFrom mechanism, I think there's nothing that you can do here.
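
If what you need is the keys in the order they first appear in the list (rather than sorted), one workaround outside of groupBy is to accumulate the groups yourself in a mutable.LinkedHashMap, which iterates in insertion order. The following is only a minimal sketch: the helper name groupByOrdered and the wrapper object are made up for illustration, and the input list is a reordered variant of the one in the question so that the effect is visible.

```scala
import scala.collection.mutable

object OrderedGroupBySketch {
  case class E(a: Int, b: String)

  // Hypothetical helper (not part of the standard library):
  // groups by key while keeping keys in first-occurrence order.
  def groupByOrdered[A, K](xs: Seq[A])(f: A => K): List[(K, List[A])] = {
    // LinkedHashMap remembers the order in which keys were first inserted.
    val groups = mutable.LinkedHashMap.empty[K, mutable.ListBuffer[A]]
    for (x <- xs) groups.getOrElseUpdate(f(x), mutable.ListBuffer.empty[A]) += x
    groups.toList.map { case (k, buf) => (k, buf.toList) }
  }

  // Deliberately not sorted by key.
  val l = List(
    E(2, "Two"),
    E(1, "One"),
    E(2, "Another Two"),
    E(1, "Another One"),
    E(3, "Three")
  )

  // Keys come out as 2, 1, 3: the order of first appearance in l.
  val grouped: List[(Int, List[E])] = groupByOrdered(l)(_.a)
}
```

Returning a List of pairs instead of a Map is a deliberate choice in this sketch, since a plain Map would lose the order again.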

You can find more on this topic in answers to similar questions, e.g. here.

Edit:

If you want to convert the map afterwards to a SortedMap (ordered by its keys), you can use SortedMap(l.groupBy(_.a).toSeq:_*) (with import scala.collection.immutable.SortedMap). Don't use ...toSeq.sortWith(...).toMap, because toMap builds a plain Map again, which does not guarantee the ordering of the result.
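
As a rough illustration of that conversion, here is a small sketch using the list from the question. The wrapper object GroupByOrderingSketch and the ListMap variant are additions for illustration and not part of the original answer; ListMap is shown only as one option when you want the map to keep the order in which the (already sorted) pairs were inserted.

```scala
import scala.collection.immutable.{ListMap, SortedMap}

object GroupByOrderingSketch {
  case class E(a: Int, b: String)

  val l = List(
    E(1, "One"),
    E(1, "Another One"),
    E(2, "Two"),
    E(2, "Another Two"),
    E(3, "Three")
  )

  // SortedMap orders its keys by the implicit Ordering[Int].
  val sortedByKey: SortedMap[Int, List[E]] =
    SortedMap(l.groupBy(_.a).toSeq: _*)

  // Alternative: sort the pairs first, then keep that order in a ListMap,
  // which iterates in insertion order.
  val inInsertionOrder: ListMap[Int, List[E]] =
    ListMap(l.groupBy(_.a).toSeq.sortBy(_._1): _*)
}
```

Iterating over either map visits the keys as 1, 2, 3.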

Community
  • Thanks. To order the map I used l.groupBy(_.a).toSeq.sortWith(_._1 < _._1).toMap. Not very elegant, but hey, it's Monday. – Jack Jan 21 '13 at 08:20
  • 10
    Rant: highly inconvenient going to the trouble of sorting a database query only to have groupBy completely unravel the ordering, forcing one to SortedMap it back together again. Particularly painful when you have collections of collections that you need to groupBy on. End rant, groupBy is also powerful and immensely useful. – virtualeyes Jan 21 '13 at 13:41
  • I hope groupBy method will improve in the future because conversion is inefficient. – Anton Kuzmin May 21 '14 at 19:53
  • @virtualeyes did you ever find a better solution? – Andy Hayden Sep 08 '16 at 20:14
11

I run into this all the time when dealing with database records. The database sorts them by some key but then groupBy undoes it! So I've started pimping the Sequence class with a function that groups by consecutive equal keys:

class PimpedSeq[A](s: Seq[A]) {

  /**
   * Group elements of the sequence that have consecutive keys that are equal.
   *
   * Use case:
   *     val lst = SQL("SELECT * FROM a LEFT JOIN b ORDER BY a.key")
   *     val grp = lst.groupConsecutiveKeys(r => r.getKey)
   */
  def groupConsecutiveKeys[K](f: (A) => K): Seq[(K, List[A])] = {
    this.s.foldRight(List[(K, List[A])]())((item: A, res: List[(K, List[A])]) =>
      res match {
        case Nil => List((f(item), List(item)))
        case (k, kLst) :: tail if k == f(item) => (k, item :: kLst) :: tail
        case _ => (f(item), List(item)) :: res
      })
  }
}

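object PimpedSeq {
  import scala.language.implicitConversions

  // Lets any Seq[A] call groupConsecutiveKeys directly.
  implicit def seq2PimpedSeq[A](s: Seq[A]): PimpedSeq[A] = new PimpedSeq(s)
}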

To use it:

import util.PimpedSeq._   // implicit conversion    
val dbRecords = db.getTheRecordsOrderedBy
val groups = dbRecords.groupConsecutiveKeys(r => r.getKey)
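
To make the behaviour concrete without a database, here is a self-contained sketch that applies the same fold to the list from the question. The standalone function groupConsecutive and the object GroupConsecutiveDemo are illustrative names, not part of the code above; the function simply repeats the foldRight logic so the example runs on its own.

```scala
object GroupConsecutiveDemo {
  case class E(a: Int, b: String)

  // Same idea as groupConsecutiveKeys, written as a plain function:
  // walk the sequence from the right and merge an item into the group at
  // the head of the result only when both have the same key.
  def groupConsecutive[A, K](s: Seq[A])(f: A => K): List[(K, List[A])] =
    s.foldRight(List.empty[(K, List[A])]) { (item, res) =>
      res match {
        case (k, items) :: tail if k == f(item) => (k, item :: items) :: tail
        case _                                  => (f(item), List(item)) :: res
      }
    }

  val l = List(
    E(1, "One"),
    E(1, "Another One"),
    E(2, "Two"),
    E(2, "Another Two"),
    E(3, "Three")
  )

  // Keys appear in list order (1, 2, 3), and each group keeps element order.
  val groups: List[(Int, List[E])] = groupConsecutive(l)(_.a)

  // Caveat: equal keys that are not adjacent end up in separate groups.
  val notAdjacent: List[(Int, List[Int])] = groupConsecutive(List(1, 2, 1))(identity)
}
```

Note that this only matches groupBy when the input is already ordered by the grouping key; for List(1, 2, 1) the key 1 shows up in two separate groups.
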
bwbecker