Suppose I have a text file like this:

Apple#mango&banana@grapes

The data needs to be split on multiple delimiters before performing the word count.

How can I do that?

Ram Ghadiyaram
Ankita

2 Answers

Use the split method with a regex character class that lists the delimiters:

scala> "Apple#mango&banana@grapes".split("[#&@]")
res0: Array[String] = Array(Apple, mango, banana, grapes)
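
One caveat with the character class alone: two adjacent delimiters produce an empty string in the result, and a leading delimiter produces an empty first element. A minimal sketch that tolerates both, assuming the same three delimiters as in the question (the object and method names are hypothetical, chosen only for illustration):

```scala
object DelimiterSplit {
  // Hypothetical helper: split on runs of '#', '&' or '@'.
  // The '+' collapses consecutive delimiters into one separator, and the
  // filter drops the empty first token that a leading delimiter leaves behind.
  def splitWords(text: String): Array[String] =
    text.split("[#&@]+").filter(_.nonEmpty)
}
```

For the sample line this returns the same four words as the plain split; the difference only shows up on input such as "#Apple##mango".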
Alper t. Turker

If you just want to count words, you don't need to split. Something like this will do:

  val numWords = """\b\w""".r.findAllIn(string).length

This regex matches the start of each word: \b is a zero-length word boundary and \w is any "word" character (letter, digit, or underscore). You get every match in the string and then simply count how many there are.
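
As a rough, self-contained illustration of that idea, the sketch below wraps the same regex in a helper and applies it to the sample line from the question. The names WordStartCount and countWords are assumptions made for the example, not part of any library:

```scala
object WordStartCount {
  // Matches a word character that directly follows a word boundary,
  // i.e. the first character of each word.
  private val wordStart = """\b\w""".r

  // Hypothetical helper: the number of word starts found in `text`.
  def countWords(text: String): Int = wordStart.findAllIn(text).size
}
```

Calling WordStartCount.countWords("Apple#mango&banana@grapes") finds one match at each of A, m, b and g. Because \w includes the underscore, a token such as foo_bar counts as a single word under this definition.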

If you want to count each word separately, and do it across multiple lines, then split is probably the better option:

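    source
      .getLines()
      .flatMap(_.split("\\W+"))   // split on runs of non-word characters
      .filterNot(_.isEmpty)       // a leading delimiter yields an empty token
      .toList                     // Iterator has no groupBy, so materialize first
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }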
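
For experimenting without a file, here is a hedged variant of the same pipeline that takes any iterator of lines. It swaps groupBy for a foldLeft over a running map, so the words never have to be collected into a list first; that is a substitution made for this sketch, not the approach shown above. WordFrequency and count are hypothetical names:

```scala
object WordFrequency {
  // Hypothetical helper: per-word counts over an iterator of lines.
  // Counting is case-sensitive, so "Apple" and "apple" are distinct keys.
  def count(lines: Iterator[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\W+"))   // split each line on runs of non-word characters
      .filter(_.nonEmpty)         // drop the empty token left by a leading delimiter
      .foldLeft(Map.empty[String, Int]) { (counts, word) =>
        counts.updated(word, counts.getOrElse(word, 0) + 1)
      }
}
```

With a real file, the lines could come from scala.io.Source.fromFile(path).getLines(), where path stands for wherever the data actually lives.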
Dima