I have a simple paired-word counting problem in PySpark. This is the input, as an RDD:

[' the adventure of the blue carbuncle  the adventure of the blue carbuncle  the adventure of the blue carbuncle ',' the adventure of the blue carbuncle']

I've already written a function that maps all the word pairs and returns an RDD, but the result holds a separate dictionary of counts for every input string.

I just need to merge the two dictionaries so that the output is (of, blue), 4 rather than 3 in the first dictionary and 1 in the second. I've tried all sorts of combinations of flatMap and reduceByKey, but none of them work. Thanks!
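
To make the desired merge concrete, here is a minimal sketch in plain Python. It assumes each per-string result is a dictionary mapping a (word, word) tuple to a count. The sample data, the function name merge_pair_counts, and the RDD name dict_rdd are hypothetical; only the (of, blue) counts of 3 and 1 come from the description above.

```python
from collections import Counter
from itertools import chain

# Hypothetical per-string dictionaries, shaped like the output described above.
# Only the ("of", "blue") counts of 3 and 1 are taken from the question.
per_string_counts = [
    {("of", "blue"): 3, ("adventure", "blue"): 3},
    {("of", "blue"): 1, ("adventure", "blue"): 1},
]


def merge_pair_counts(dicts):
    """Flatten every dictionary into (pair, count) items, then sum counts per pair."""
    totals = Counter()
    for pair, count in chain.from_iterable(d.items() for d in dicts):
        totals[pair] += count
    return dict(totals)


merged = merge_pair_counts(per_string_counts)
print(merged[("of", "blue")])  # 4

# On an RDD of such dictionaries (dict_rdd is an assumed name), the same two
# steps would presumably be written as:
#   dict_rdd.flatMap(lambda d: d.items()).reduceByKey(lambda a, b: a + b)
```

The point of the sketch is the order of the steps: the dictionaries first have to be flattened into individual (pair, count) tuples, because reduceByKey only combines values across elements that are key-value pairs, not across whole dictionaries.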

pltc
Teddy