I am currently using GraphFrames to retrieve connected components from a graph.
My code is very simple:
from graphframes import GraphFrame

v = sqlContext.createDataFrame(node, ["id", "name"])
print(v.take(15))
e = sqlContext.createDataFrame(edge, ["src", "dst"])
print(e.take(15))
g = GraphFrame(v, e)
# The NullPointerException appears when connectedComponents() runs
res = g.connectedComponents()
Below is the output of the code snippet, which looks correct to me.
Print Vertices:
[Row(id=6, name=u'6'), Row(id=12, name=u'12'), Row(id=1, name=u'1'), Row(id=3, name=u'3'), Row(id=9, name=u'9'), Row(id=2, name=u'2'), Row(id=11, name=u'11'), Row(id=10, name=u'10'), Row(id=5, name=u'5'), Row(id=4, name=u'4')]
Print Edges:
[Row(src=2, dst=9), Row(src=2, dst=5), Row(src=2, dst=6), Row(src=9, dst=10), Row(src=11, dst=12), Row(src=4, dst=10), Row(src=1, dst=2), Row(src=1, dst=3), Row(src=1, dst=12)]
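As a sanity check on the input, here is a minimal plain-Python sketch (no Spark or GraphFrames involved) that groups the ids printed above with a simple union-find. The function name connected_components and the variables vertex_ids and edge_pairs are my own for illustration, and I am assuming edges are treated as undirected:

```python
def connected_components(vertices, edges):
    """Group vertex ids into components using union-find (plain Python, no Spark)."""
    parent = {v: v for v in vertices}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            # Keep the smaller id as the representative of the merged set.
            parent[max(root_src, root_dst)] = min(root_src, root_dst)

    groups = {}
    for v in vertices:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(members) for members in groups.values())


# Ids and edges copied from the printed output above.
vertex_ids = [6, 12, 1, 3, 9, 2, 11, 10, 5, 4]
edge_pairs = [(2, 9), (2, 5), (2, 6), (9, 10), (11, 12),
              (4, 10), (1, 2), (1, 3), (1, 12)]
components = connected_components(vertex_ids, edge_pairs)
print(components)
```

With this input all ten vertices end up in one group, so I would expect connectedComponents() to report a single component as well.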
However, when g.connectedComponents() is executed, the program starts reporting the following NullPointerException.
I would appreciate any suggestions on what is going wrong here!
ERROR LiveListenerBus: Listener JobProgressListener threw an exception
java.lang.NullPointerException
    at org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onTaskEnd$1.apply(JobProgressListener.scala:361)
    at org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onTaskEnd$1.apply(JobProgressListener.scala:360)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
    at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
    at org.apache.spark.ui.jobs.JobProgressListener.onTaskEnd(JobProgressListener.scala:360)
    at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:42)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
    at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1183)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)