I am retrieving data from Neo4j using the Bolt driver in Python. The returned result should be stored as an RDD (or at least written to a CSV file). I can see the returned results, but I am unable to store them as an RDD, a DataFrame, or even a CSV file.

Here is how I run the query and get the result:

session = driver.session()
result = session.run('MATCH (n) RETURN  n.hobby,id(n)')  
session.close()     

How can I store this data in an RDD or a CSV file?

Jack Daniel

2 Answers

I deleted the old post and reposted the same question, but I haven't received any pointers, so I am posting my own approach in case it helps others.

from pandas import DataFrame
from pyspark.sql import SQLContext

# `driver` is the Bolt driver from the question and `sc` is an existing SparkContext.

# Run the query; the records are read below, before the session is closed
session = driver.session()
result = session.run('MATCH (n:Hobby) RETURN n.hobby AS hobby, id(n) AS id LIMIT 10')

# Pulling the keys (the column names of the result)
keys = list(result.keys())

# Reading all the property values and storing them in a list of rows
values = [[record[key] for key in keys] for record in result]
session.close()

# Converting the list of values into a pandas DataFrame
df = DataFrame(values, columns=keys)
print(df)

# Converting the pandas DataFrame to a Spark DataFrame
sqlCtx = SQLContext(sc)
spark_df = sqlCtx.createDataFrame(df)
spark_df.show()

# Converting the Spark DataFrame to an RDD of tuples
rdd = spark_df.rdd.map(tuple)
print(rdd.take(10))

Any suggestions for improving the efficiency are highly appreciated.
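The question also asked about CSV, so here is a minimal sketch of that path using only the standard library. The helper names records_to_rows and write_rows_to_csv, the sample data, and the file name are made up for illustration; the sketch only assumes that each record can be indexed by key, as the driver's records are in the code above.

```python
import csv


def records_to_rows(records, keys):
    """Turn an iterable of key-indexable records into a list of value tuples."""
    return [tuple(record[key] for key in keys) for record in records]


def write_rows_to_csv(path, keys, rows):
    """Write a header line followed by one line per row."""
    # newline='' lets the csv module control the line endings (Python 3)
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        writer.writerow(keys)
        writer.writerows(rows)


# Stand-in data; with the driver, the records would come from session.run(...)
sample_records = [{'hobby': 'chess', 'id': 0}, {'hobby': 'hiking', 'id': 1}]
sample_rows = records_to_rows(sample_records, ['hobby', 'id'])
```

Calling write_rows_to_csv('hobbies.csv', ['hobby', 'id'], sample_rows) would write the file. The same list of tuples could also be handed to sc.parallelize(...) to build an RDD without going through pandas, which may help if the pandas step turns out to be a cost.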

Jack Daniel

Instead of going from Python to Spark, why not use the Neo4j Spark connector? I think this would keep Python from becoming a bottleneck if you are moving a lot of data. You can put your Cypher query inside the Spark session and save the result as an RDD.

There has been talk in the Neo4j Slack group about a PySpark implementation, which will hopefully be available later this fall. I know the ability to query Neo4j from PySpark and SparkR would be very useful.
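For illustration, here is a rough sketch of what a connector-based read could look like from Python. It assumes a release of the Neo4j Connector for Apache Spark that exposes the org.neo4j.spark.DataSource format, with the connector jar already on the Spark classpath. The helper names, URL, credentials, and query are placeholders, not something from the original post.

```python
def neo4j_read_options(url, user, password, query):
    """Collect the reader options for a Cypher query (hypothetical helper)."""
    return {
        'url': url,
        'authentication.basic.username': user,
        'authentication.basic.password': password,
        'query': query,
    }


def read_cypher(spark, options):
    """Load the query result as a Spark DataFrame.

    `spark` is assumed to be an existing SparkSession that can see the
    connector jar; nothing here starts Spark or contacts Neo4j by itself.
    """
    return spark.read.format('org.neo4j.spark.DataSource').options(**options).load()


# Placeholder connection details, to be replaced with real ones
options = neo4j_read_options(
    'bolt://localhost:7687',
    'neo4j',
    'password',
    'MATCH (n:Hobby) RETURN n.hobby AS hobby, id(n) AS id',
)
```

With a live session, read_cypher(spark, options).rdd would give the RDD directly, so the rows never pass through the Python driver process.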