
I am new to Spark/Scala. I know how to load CSV files:

    sqlContext.read.format("csv")

and how to read text streams and file streams:

    scc.textFileStream("""file:///c:\path\filename""");
    scc.fileStream[LongWritable, Text, TextInputFormat](...)

But how do I read a text stream in CSV format? Thanks, Levi

mfirry
Levi

2 Answers


Here you go:

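    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Assumed local configuration; adjust the app name and master for your setup.
    val sparkConf = new SparkConf().setAppName("CsvTextStream").setMaster("local[2]")
    val ssc = new StreamingContext(sparkConf, Seconds(5))

    // Create the FileInputDStream on the directory
    val lines = ssc.textFileStream("file:///C:/foo/bar")

    lines.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        println("RDD row count: " + rdd.count())
        // Now you can convert this RDD to a DataFrame/Dataset and apply your business logic.
      }
    }

    ssc.start()
    ssc.awaitTermination()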
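
The conversion step mentioned in the comment could look roughly like the sketch below. It assumes Spark 2.2 or later, where `spark.read.csv` accepts a `Dataset[String]`, and that the files have no header row. The object name `CsvTextStreamSketch`, the helper `linesToDataFrame`, and the local master setting are made up for illustration; the directory is the one used above.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Encoders, SparkSession}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CsvTextStreamSketch {

  // Hypothetical helper: parse raw CSV lines with Spark's built-in CSV reader.
  // Assumes no header row, so columns are named _c0, _c1, ...
  def linesToDataFrame(spark: SparkSession, rdd: RDD[String]): DataFrame = {
    val ds = spark.createDataset(rdd)(Encoders.STRING)
    spark.read.option("header", "false").csv(ds)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local configuration for experimenting on one machine.
    val sparkConf = new SparkConf().setAppName("CsvTextStreamSketch").setMaster("local[2]")
    val ssc = new StreamingContext(sparkConf, Seconds(5))

    // Each new file dropped into the directory is read as lines of text.
    val lines = ssc.textFileStream("file:///C:/foo/bar")

    lines.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        // Reuse (or lazily create) a SparkSession from the streaming context's config.
        val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
        val df = linesToDataFrame(spark, rdd)
        df.show()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Letting the CSV reader do the parsing, rather than splitting each line on commas by hand, means quoted fields and embedded commas are handled by Spark's CSV options.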
Sudheer Palyam

You can stream your CSV files easily using Structured Streaming in Spark 2.2.

You can refer here
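
A minimal sketch of the Structured Streaming approach, assuming CSV files arrive in the same directory as in the answer above. The column names and types in `schema` and the object name `CsvStructuredStreamSketch` are hypothetical, so replace them to match your data. File-based streaming sources need an explicit schema unless `spark.sql.streaming.schemaInference` is enabled.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

object CsvStructuredStreamSketch {

  // Hypothetical schema; adjust names and types to the real CSV layout.
  val schema: StructType = new StructType()
    .add("id", IntegerType)
    .add("name", StringType)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("CsvStructuredStreamSketch")
      .master("local[2]") // assumed local run
      .getOrCreate()

    // Treat every new CSV file in the directory as rows appended to an unbounded table.
    val csvStream = spark.readStream
      .schema(schema)
      .option("header", "false")
      .csv("file:///C:/foo/bar")

    // Print each micro-batch to the console; swap in another sink as needed.
    val query = csvStream.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```

Here `csvStream` is an ordinary streaming `DataFrame`, so the usual `select`, `filter`, and `groupBy` operations apply before the sink.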

Naman Agarwal