I have a function that takes a string, extracts values from it using substring, and queries a Cassandra table with those values.
def formatInputString(line: String) = {
  // extract values from line using substring and query the Cassandra table
}
If I pass it the lines of a text file read with Source.fromFile, it works (it prints the result from Cassandra):
// using Scala getLines()
import scala.io.Source

for (line <- Source.fromFile("file.txt").getLines()) {
  formatInputString(line)
}
But it just hangs if I use a Spark RDD like this:
// using Spark RDD
val line = sc.textFile("file.txt")
val lst = line.map(formatInputString)
Can somebody explain this behaviour and how to get around it? (I need to use the RDD version.)