22

I was browsing around and found a question about grouping a String by its characters, such as this:

The input:

"aaabbbccccdd"

Would produce the following output:

"aaa"
"bbb"
"cccc"
"dd"

and I found this suggestion:

val str = "aaabbbccccdd"
val list = str.groupBy(identity).toList.sortBy(_._1).map(_._2)

And this identity fellow got me curious. I found out it is defined in Predef like this:

def identity[A](x: A): A

So basically it returns whatever it is given, right? But how does that apply in the call to groupBy?

I'm sorry if this is a basic question; it's just that functional programming is still tangling my brain a little. Please let me know if there's any information I can give to make this question clearer.

Rodrigo Sasaki

6 Answers

14

This is your expression:

val list = str.groupBy(identity).toList.sortBy(_._1).map(_._2)

Let's go function by function. The first one is groupBy, which partitions your String using the keys returned by the discriminator function, which in your case is identity. The discriminator function is applied to each character in the string, and all characters that return the same result are grouped together. If we wanted to separate the letter a from the rest, we could use x => x == 'a' as our discriminator function. That would group your string's chars by the return value of this function (true or false) into a map:

 Map(false -> bbbccccdd, true -> aaa)

By using identity, which is a "nice" way to say x => x, we get a map where each distinct character becomes a key and its occurrences are grouped under it. In your case:

Map(c -> cccc, a -> aaa, d -> dd, b -> bbb)

Then we convert the map to a list of tuples (Char, String) with toList.

Finally, we order the list by char with sortBy and keep just the String from each tuple with map, which gives your final result.
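To make the steps above concrete, here is a small sketch that splits the one-liner into named intermediate values. The object name and the val names are only for illustration; they are not part of the original expression.

```scala
object GroupByWalkthrough {
  val str = "aaabbbccccdd"

  // Step 1: group the characters by themselves.
  // The iteration order of the resulting Map is not guaranteed.
  val grouped = str.groupBy(identity)

  // Step 2: turn the map into a list of (key, group) pairs.
  val pairs = grouped.toList

  // Step 3: sort the pairs by their key, i.e. the character.
  val sorted = pairs.sortBy(_._1)

  // Step 4: keep only the groups; prints as List(aaa, bbb, cccc, dd).
  val result = sorted.map(_._2)

  // A different discriminator: "is this character an 'a'?"
  // This yields only two groups, keyed by true and false.
  val isA = str.groupBy(x => x == 'a')
}
```

The sorting step is there because a Map makes no promise about the order of its keys, so without it the groups could come out in any order.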

Vinicius Miana
13

To understand this, just start the Scala REPL with the -Xprint:typer option:

val res2: immutable.Map[Char,String] = augmentString(str).groupBy[Char]({
   ((x: Char) => identity[Char](x))
});

Scalac converts a plain String into StringOps, which is a subclass of TraversableLike, which has a groupBy method:

def groupBy[K](f: A => K): immutable.Map[K, Repr] = {
  val m = mutable.Map.empty[K, Builder[A, Repr]]
  for (elem <- this) {
    val key = f(elem)
    val bldr = m.getOrElseUpdate(key, newBuilder)
    bldr += elem
  }
  val b = immutable.Map.newBuilder[K, Repr]
  for ((k, v) <- m)
    b += ((k, v.result))

  b.result
}

So groupBy builds a map keyed by the values that the identity function returns, and appends each char to the builder stored under its key.
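The same idea can be sketched in a stripped-down form that only handles a String. This is not the library code: groupChars and SimpleGroupBy are hypothetical names, and a StringBuilder stands in for the generic Builder used above.

```scala
import scala.collection.mutable

object SimpleGroupBy {
  // Hypothetical helper, not the library method:
  // groups the chars of s under the key f(c).
  def groupChars[K](s: String)(f: Char => K): Map[K, String] = {
    val builders = mutable.Map.empty[K, StringBuilder]
    for (c <- s) {
      val key = f(c)
      // Reuse the builder for this key, or create one on first sight.
      builders.getOrElseUpdate(key, new StringBuilder) += c
    }
    // Freeze every builder into an immutable String.
    builders.map { case (k, b) => k -> b.toString }.toMap
  }

  // With identity, the key computed for each char is the char itself.
  val byIdentity = groupChars("aaabbbccccdd")(identity)
}
```

With identity as f, every distinct char gets its own builder, which is why the result has one entry per distinct character.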

4lex1v
8

First, let's see what happens when you iterate over a String:

scala> "asdf".toList
res1: List[Char] = List(a, s, d, f)

Next, consider that sometimes we want to group elements on the basis of some specific attribute of an object.

For instance, we might group a list of strings by length as in...

List("aa", "bbb", "bb", "bbb").groupBy(_.length)

What if you just wanted to group each item by the item itself? You could pass in the identity function like this:

List("aa", "bbb", "bb", "bbb").groupBy(identity)

You could also do something like this, but it would be silly, since the elements are already strings:

List("aa", "bbb", "bb", "bbb").groupBy(_.toString)
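For comparison, here is a small sketch of what the three calls above produce. The object and val names are only for illustration, and the key order shown in the comments may differ when printed.

```scala
object GroupingChoices {
  val words = List("aa", "bbb", "bb", "bbb")

  // Key is the length: Map(2 -> List(aa, bb), 3 -> List(bbb, bbb))
  val byLength = words.groupBy(_.length)

  // Key is the element itself:
  // Map(aa -> List(aa), bbb -> List(bbb, bbb), bb -> List(bb))
  val bySelf = words.groupBy(identity)

  // Same grouping as bySelf, because toString on a String returns it unchanged.
  val byToString = words.groupBy(_.toString)
}
```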
Larsenal
3

Take a look at

str.groupBy(identity)

which returns

scala.collection.immutable.Map[Char,String] = Map(b -> bbb, d -> dd, a -> aaa, c -> cccc)

So the key by which the elements are grouped is the character itself.

Reactormonk
1

Whenever you use methods such as groupBy on a String, it's important to note that it is implicitly converted to StringOps and not to List[Char].

StringOps

The signature of groupBy is given by:

def groupBy[K](f: (Char) ⇒ K): Map[K, String]

Hence, the result is of the form:

Map[Char,String]

List[Char]

The signature of groupBy is given by:

def groupBy[K](f: (Char) ⇒ K): Map[K, List[Char]]

If it had been implicitly converted to List[Char], the result would be of the form:

Map[Char,List[Char]]

This should answer your question of how Scala figured out to group by Char (see the signature) and yet give you a Map[Char, String].
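A quick way to see the difference is to run both variants side by side. This is only a sketch with illustrative names; the exact static type reported for the String version may vary between Scala versions, so it is left unannotated here.

```scala
object StringVsList {
  val str = "aaabbbccccdd"

  // Grouping the String directly: each group prints as text, e.g. a -> aaa.
  val viaString = str.groupBy(identity)

  // Converting to List[Char] first: each group is a List[Char],
  // e.g. a -> List(a, a, a).
  val viaList: Map[Char, List[Char]] = str.toList.groupBy(identity)

  // Joining each List[Char] back together gives the same text
  // as the String version.
  val rejoined: Map[Char, String] =
    viaList.map { case (k, chars) => k -> chars.mkString }
}
```

In both cases the keys are the distinct characters; only the type of the grouped values changes.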

Shrey
  • Very interesting point. But how would I know exactly what it does? Does it group by each individual char in the `String`, avoiding repetitions? – Rodrigo Sasaki Oct 03 '13 at 18:44
0

Basically list.groupBy(identity) is just a fancy way of saying list.groupBy(x => x), which in my opinion is clearer. It groups a list containing duplicate items by those items.

Alvaro Mendez