def ngrams(n: Int, words: Array[String]) = {
  // i runs from 1 to n, so 1-grams are included
  (1 to n).map { i => words.sliding(i).toStream }
    .foldLeft(Stream[Array[String]]()) {
      (a, b) => a #::: b
    }
}
scala> val op2 = ngrams(3, "how are you".split(" ")).foreach { x => println(x.mkString(" ")) }
Output:
how
are
you
how are
are you
how are you
op2: Unit = ()

How can I avoid the Unit value above? I actually want to convert the n-grams to a Set, but that fails because op2 is Unit = (). Could you please help me get the output as Set(how, are, you, how are, are you, how are you)? Thanks for the post "How to generate n-grams in scala?".

MapReddy Usthili

2 Answers


The short answer is that the return type of foreach is Unit. So when you assign the output of foreach to op2, the type of op2 is Unit and its value is ().
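As a small illustration of that point, here is a sketch contrasting foreach with map on a plain List. The names words, printed and upper are made up for this example and are not from the question.

```scala
val words = List("how", "are", "you")

// foreach runs println for each element and returns Unit,
// so the value bound to `printed` is just ().
val printed: Unit = words.foreach(println)

// map builds a new collection from the results, which is worth keeping.
val upper: List[String] = words.map(_.toUpperCase)
```

Here upper is List(HOW, ARE, YOU), while printed holds nothing useful.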

It sounds like what you want to do is the following:

  1. calculate the n-grams using the ngrams method,
  2. store the Set of n-grams to op2, and
  3. print out all the n-grams.

Let's start with the type of the ngrams method:

(n: Int, words: Array[String]) => Stream[Array[String]]

It returns a Stream, which looks like it can easily be turned into a Set with toSet:

ngrams(3, "how are you".split(" ")).toSet

However, this is dangerous because in Scala, Array equality is by reference, so two arrays with the same contents count as distinct elements of a Set. It is much safer to convert your Stream[Array[String]] into a Stream[List[String]]: List equality is structural, so any duplicates are removed (this assumes that order matters within each n-gram):

val op2 = ngrams(3, "how are you".split(" ")).map(_.toList).toSet
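To see why the conversion to List matters, the following sketch compares two arrays with identical contents. The names a, b and the result vals are hypothetical and only serve the illustration.

```scala
val a = Array("how", "are")
val b = Array("how", "are")

// == on arrays compares references, not contents.
val sameReference = a == b                      // false
// sameElements compares the contents pairwise.
val sameContents = a.sameElements(b)            // true

// A Set of arrays keeps both, because they are different objects.
val arraySetSize = Set(a, b).size               // 2
// A Set of lists keeps one, because List equality is structural.
val listSetSize = Set(a.toList, b.toList).size  // 1
```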

Now, it's easy to print out the Set[List[String]] the same way you did the Stream[Array[String]]:

op2.foreach { x => println((x.mkString(" ")))}

Because the result of foreach is (), the only value of type Unit, there is no reason to assign it to a variable.

Ben Reich
Dan Gallagher
  • @MapReddy – note that Dan answered the question, not me. I just edited it a bit, so you really have him to thank! Thank him with an upvote :) – Ben Reich May 18 '15 at 13:01

The Unit comes from the type of op2: foreach returns Unit, so that is what gets assigned. You could do either of the following:

  1. Remove the assignment to op2:

ngrams(3, "how are you".split(" ")).foreach { x => println((x.mkString(" ")))}

  2. Change .foreach to .map, then evaluate op2 to see the result:

scala> val op2 = ngrams(3, "how are you".split(" ")).map { x => x.mkString(" ")}.toList

scala> op2
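If a Set of strings is wanted rather than a List, as the question asks, one possible variation is sketched below. The helper ngramStrings is a hypothetical name, not part of the original code, and it takes a Seq[String] instead of an Array and skips the Stream, which is an assumption made to keep the example short.

```scala
// Hypothetical helper: joins each n-gram into a single string and collects
// the results into a Set, so duplicates are dropped by String equality.
def ngramStrings(n: Int, words: Seq[String]): Set[String] =
  (1 to n).flatMap { i =>
    words.sliding(i).map(_.mkString(" "))
  }.toSet

val op2 = ngramStrings(3, "how are you".split(" ").toSeq)

// Printing is a separate step; its Unit result is not assigned to anything.
op2.foreach(println)
```

Note that a Set has no guaranteed iteration order, so the n-grams may print in a different order than they were generated.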

korefn