My task is basically:
1. Read data from Google Cloud BigQuery using Spark/Scala.
2. Perform some operation on the data (e.g., an update).
3. Write the data back to BigQuery.
So far, I am able to read data from BigQuery using newAPIHadoopRDD(), which returns an RDD[(LongWritable, JsonObject)]:
tableData.map(entry => (entry._1.toString(),entry._2.toString()))
.take(10)
.foreach(println)
Here is a sample of the data it prints:
(341,{"id":"4","name":"Shahu","score":"100"})
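For context, the read path looks roughly like the sketch below. It assumes the Google BigQuery connector for Hadoop (GsonBigQueryInputFormat); the project, bucket, and table names are placeholders, and the local master is only for illustration:

```scala
import com.google.cloud.hadoop.io.bigquery.{BigQueryConfiguration, GsonBigQueryInputFormat}
import com.google.gson.JsonObject
import org.apache.hadoop.io.LongWritable
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("bq-read")
  .master("local[*]")  // local master only for illustration; on Dataproc this is set by the cluster
  .getOrCreate()
val sc = spark.sparkContext

val conf = sc.hadoopConfiguration
// Placeholder project/bucket/table identifiers -- replace with your own.
conf.set(BigQueryConfiguration.PROJECT_ID_KEY, "my-project")
conf.set(BigQueryConfiguration.GCS_BUCKET_KEY, "my-temp-bucket")
BigQueryConfiguration.configureBigQueryInput(conf, "my-project:my_dataset.my_table")

// Lazily defines the RDD[(LongWritable, JsonObject)]; no data is pulled until an action runs.
val tableData = sc.newAPIHadoopRDD(
  conf,
  classOf[GsonBigQueryInputFormat],
  classOf[LongWritable],
  classOf[JsonObject])
```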
I am not able to figure out which functions I should use on this RDD to meet the requirement.
Do I need to convert this RDD to a DataFrame/Dataset, or to some JSON representation? If so, how?
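From what I can tell, one possible direction is to drop the LongWritable key and let Spark infer a schema from the JSON payloads. This is an untested sketch with stand-in data (the parallelized string plays the role of tableData.map(_._2.toString), and the score update is just an illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder()
  .appName("bq-to-df")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Stand-in for tableData.map(_._2.toString) -- the JsonObject payloads as strings.
val jsonStrings = spark.sparkContext.parallelize(Seq(
  """{"id":"4","name":"Shahu","score":"100"}"""))

// Spark can infer a schema directly from JSON strings (Spark 2.2+ expects a Dataset[String]).
val df = spark.read.json(jsonStrings.toDS())

// An "update" then becomes a column transformation, e.g. add 10 to every score.
val updated = df.withColumn("score", (col("score").cast("int") + 10).cast("string"))
updated.show()
```

Once the data is a DataFrame, the update step stops being a per-record JSON manipulation and becomes an ordinary column expression.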