
I have two key-value pairs of type org.apache.spark.streaming.dstream.DStream[Int].

The first key-value pair is (word, frequency). The second key-value pair is (number of rows, value).

I would like to divide the frequency by the value for each word, but I get the following error: value / is not a member of org.apache.spark.streaming.dstream.DStream[Int]

Sample code:

Here f is the frequency of the word, c is the total count, and rdd holds the (word, frequency) pairs.

val cp = rdd.foreachRDD {
  x => (x, f/c)
}
Yuval Itzchakov
neoguy

1 Answer


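First apply the transform operation to the DStream object; inside it you get the RDD for each batch (a plain map on a DStream sees individual elements, not RDDs). Then apply a map transformation to that RDD as follows:

dStream.transform { rdd =>
  // f and c must be plain numeric values here, not DStreams
  rdd.map(x => (x, f.toDouble / c))
}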

If f is a DStream, you need to collect its value first before using it inside an RDD or DStream closure.
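
Since DStream itself has no collect method, one possible way to get a plain number per batch is to pair the two streams with transformWith and run an action on the count RDD inside it. The following is only a minimal sketch, assuming the word frequencies are a DStream[(String, Int)] and the total count is a DStream[Long] (for example the result of count()). The names RelativeFrequency, divideByTotal, wordCounts and totals are hypothetical and do not come from the question.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

object RelativeFrequency {

  // Per-batch logic: divide each word's frequency by the batch total.
  // Assumption: `totals` holds the row count(s) for the same batch.
  def divideByTotal(counts: RDD[(String, Int)],
                    totals: RDD[Long]): RDD[(String, Double)] = {
    // fold is an RDD action, so `total` is a plain Long on the driver.
    // It also copes with an empty RDD, unlike reduce.
    val total = totals.fold(0L)(_ + _)
    counts.map { case (word, freq) =>
      // Guard against division by zero for empty batches.
      (word, if (total == 0L) 0.0 else freq.toDouble / total)
    }
  }

  // transformWith hands over the RDDs of both streams for the same batch.
  def relativeFrequencies(wordCounts: DStream[(String, Int)],
                          totals: DStream[Long]): DStream[(String, Double)] =
    wordCounts.transformWith[Long, (String, Double)](
      totals,
      (counts: RDD[(String, Int)], t: RDD[Long]) => divideByTotal(counts, t)
    )
}
```

With streams of those assumed types, RelativeFrequency.relativeFrequencies(wordCounts, totals).print() would show the ratios for each batch. Keeping the per-batch logic in divideByTotal means it can be tried on ordinary RDDs without starting a StreamingContext.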

Sandeep Purohit
  • Yes, f is a DStream. But when I try to collect it, I get: value collect is not a member of org.apache.spark.streaming.dstream.DStream[Int] – neoguy Oct 24 '16 at 10:59
  • I am not suggesting this as a good solution, but one thing you can do is save the f DStream to files with saveAsTextFiles, then read those text files back as an RDD, collect it, and use the result as the value of f. – Sandeep Purohit Oct 24 '16 at 11:03