I know Spark does not allow you to use functions that create RDDs inside map or any of its variants. Is there a workaround for this? For instance, can I perform a standard loop over all of an RDD's entries within a partition? Put another way, is there a method that converts an RDD to a list on each node, so that each node holds a plain list of the entries it was carrying?
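To make the question concrete, here is a minimal sketch of the kind of per-partition iteration I have in mind (the data and function names here are made up); glom() and mapPartitions() seem closest to what I'm describing, but as far as I can tell neither lifts the restriction, since functions run inside them still cannot create RDDs:

```python
from pyspark import SparkContext

sc = SparkContext.getOrCreate()
rdd = sc.parallelize(range(10), 2)

# glom() coalesces each partition into a single list, so every
# node ends up holding a list of the entries it was carrying:
print(rdd.glom().collect())  # e.g. [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]

# mapPartitions() hands each partition to the function as an
# iterator, which can be materialized as an ordinary Python list:
def per_partition(entries):  # hypothetical example function
    local = list(entries)    # all entries held by this partition
    yield sum(local)         # any plain-Python work on that list

print(rdd.mapPartitions(per_partition).collect())  # e.g. [10, 35]
```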
I'm trying to do some graph work with graphframes in pyspark, and this restriction currently makes what I want impossible.