I am using the transform API of DStream (Spark Streaming) to sort data. I am reading from a TCP socket using netcat. This is the line of code I use: myDStream.transform(rdd=>rdd.sortByKey())

The function sortByKey cannot be found. Could anyone please explain what the issue is in this step?

1 Answer

If you use netcat as the input, you are most likely reading it with socketTextStream, which returns a ReceiverInputDStream[String]. In that case transform expects a function of type:

(RDD[String]) => RDD[U]

Only an RDD[(T, U)], where T has a corresponding Ordering, can be sorted with sortByKey. An RDD[String] is not a pair RDD, so sortByKey is not available on it. For other RDDs you can use sortBy:

myDStream.transform(rdd => rdd.sortBy(x => x))
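
For completeness, here is a minimal sketch of both options in one program. The host, port, batch interval, object name, and the comma-separated "key,value" line format are all assumptions for illustration and are not taken from the question. If your lines already have a natural key, mapping them to pairs first should make sortByKey available:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical example object; names and settings are placeholders.
object SortStreamSketch {
  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the socket receiver, one for processing
    val conf = new SparkConf().setMaster("local[2]").setAppName("SortStreamSketch")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Assumed source: `nc -lk 9999` running on localhost
    val lines = ssc.socketTextStream("localhost", 9999)

    // Option 1: sort the raw strings of each batch; works on any RDD[String]
    val sortedLines = lines.transform(rdd => rdd.sortBy(x => x))

    // Option 2: build (key, value) pairs first, then sortByKey is available.
    // Assumes each line looks like "key,value"; lines without a comma get an empty value.
    val pairs = lines.map { line =>
      val parts = line.split(",", 2)
      (parts(0), if (parts.length > 1) parts(1) else "")
    }
    val sortedPairs = pairs.transform(rdd => rdd.sortByKey())

    sortedLines.print()
    sortedPairs.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that either way the sort applies to each batch separately, not to the stream as a whole. On older Spark releases (before 1.3, as far as I recall) you may additionally need import org.apache.spark.SparkContext._ to bring the pair RDD implicits into scope.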