I'm not sure if you are violating any rules here by asking the same question two times.
To detect communities with graphframes, at first you have to create graphframes object. Give your example dataframe the following code snippet shows you the necessary transformations:
from graphframes import *
sc.setCheckpointDir("/tmp/connectedComponents")
l = [
( '600060' , '09283744'),
( '600131' , '96733110'),
( '600194' , '01700001')
]
columns = ['valx', 'valy']
#this is your input dataframe
edges = spark.createDataFrame(l, columns)
#graphframes requires two dataframes: an edge and a vertice dataframe.
#the edge dataframe has to have at least two columns labeled with src and dst.
edges = edges.withColumnRenamed('valx', 'src').withColumnRenamed('valy', 'dst')
edges.show()
#the vertice dataframe requires at least one column labeled with id
vertices = edges.select('src').union(edges.select('dst')).withColumnRenamed('src', 'id')
vertices.show()
g = GraphFrame(vertices, edges)
Output:
+------+--------+
| src| dst|
+------+--------+
|600060|09283744|
|600131|96733110|
|600194|01700001|
+------+--------+
+--------+
| id|
+--------+
| 600060|
| 600131|
| 600194|
|09283744|
|96733110|
|01700001|
+--------+
You wrote in the comments of your other question that the community detection algorithmus doesn't matter for you currently. Therefore I will pick the connected components:
result = g.connectedComponents()
result.show()
Output:
+--------+------------+
| id| component|
+--------+------------+
| 600060|163208757248|
| 600131| 34359738368|
| 600194|884763262976|
|09283744|163208757248|
|96733110| 34359738368|
|01700001|884763262976|
+--------+------------+
Other community detection algorithms (like LPA) can be found in the user guide.