I have a relational database whose tables I exported to CSV files. I imported two of them and created the nodes by specifying which columns to pick, as in the following code:
import csv
from py2neo import neo4j, authenticate, Graph, Node, cypher, rel, Relationship
authenticate("localhost:7474", "neo4j", "my_password")
graph_db = Graph()
graph_db.delete_all()
"""import all rows and columns of csv files"""
# "rb" suits the csv module on Python 2; on Python 3 open with newline='' instead
with open('File1.csv', "rb") as abc_file, open('File2.csv', "rb") as efg_file:
    data1 = csv.reader(abc_file, delimiter=';')
    data2 = csv.reader(efg_file, delimiter=';')
    # Skip the header row of each file
    next(data1)
    next(data2)

    # Create a node for every row of the "Contact Email" column of abc_file.
    # Iterate over the csv reader, not the file object, so row[0] is a column
    # value rather than the first character of a raw line.
    for row in data1:
        nodes1 = Node("Contact_Email", email=row[0])
        contact_graph = graph_db.create(nodes1)

    # Create nodes for every row of the "Building_Name" and "Person_Created"
    # columns of efg_file
    for row in data2:
        nodes2 = Node("Building_Name", name=row[0])
        nodes3 = Node("Person_Created", name=row[1])
        building_graph = graph_db.create(nodes2, nodes3)
Let's say there are 60 emails under the "Contact_Email" column of "File1.csv", which is the primary key. It is used as a foreign key in "File2.csv" under the "Person_Created" column. There are 14 buildings listed under "Building Name", with the corresponding emails in the "Person_Created" column. My questions are:
1) How can I match the 14 emails in the "Person_Created" column of File2.csv against the emails in the "Contact Email" column of File1.csv, so that no duplicate nodes are created?
2) How can I create a relationship between each "Building Name" (in File2.csv) and the matching "Person_Created" email (from File1.csv) without any duplication, something like "Building1234 is DESIGNED_BY abc@xyz.com"?
How can I do this in py2neo, with or without Cypher?
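To make the goal concrete, here is a minimal sketch of one possible approach, not a confirmed solution. It relies on Cypher's MERGE clause, which matches an existing pattern or creates it if absent, so each email, building and DESIGNED_BY relationship ends up in the graph only once. The helper names `read_rows` and `build_statements` are hypothetical, the code is written for Python 3, and it only builds (query, parameters) pairs without contacting a database. The `$name` parameter style assumes Neo4j 3.0 or later; older 2.x servers use `{name}` instead.

```python
import csv
import io

# MERGE is match-or-create, so these statements can be re-run safely.
# Labels and the DESIGNED_BY type are taken from the question.
MERGE_CONTACT = "MERGE (c:Contact_Email {email: $email})"
MERGE_DESIGNED_BY = (
    "MERGE (c:Contact_Email {email: $email}) "
    "MERGE (b:Building_Name {name: $name}) "
    "MERGE (b)-[:DESIGNED_BY]->(c)"
)


def read_rows(text, delimiter=";"):
    """Parse delimited CSV text and drop the header row (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return rows[1:]


def build_statements(contact_rows, building_rows):
    """Return (cypher, parameters) pairs, one per distinct email or pair."""
    statements = []

    # One MERGE per distinct contact email (column 0 of File1.csv).
    seen_emails = set()
    for row in contact_rows:
        if not row:
            continue
        email = row[0].strip()
        if email and email not in seen_emails:
            seen_emails.add(email)
            statements.append((MERGE_CONTACT, {"email": email}))

    # One MERGE per distinct (building, email) pair (columns 0 and 1 of File2.csv).
    seen_pairs = set()
    for row in building_rows:
        if len(row) < 2:
            continue
        pair = (row[0].strip(), row[1].strip())
        if all(pair) and pair not in seen_pairs:
            seen_pairs.add(pair)
            statements.append((MERGE_DESIGNED_BY, {"name": pair[0], "email": pair[1]}))

    return statements
```

Each pair would then be handed to whatever query call the installed py2neo version provides. I believe this is `graph_db.cypher.execute(query, params)` in py2neo 2.x and `graph.run(query, params)` in later releases, but check the documentation for your version. Because the second statement merges the contact node by email as well, a "Person_Created" value reuses the existing "Contact_Email" node instead of creating a separate one.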