I just started learning py2neo and Neo4j, and I'm running into a problem with duplicates. I'm writing a simple Python script that builds a database of scientific papers and authors. I only need to add the paper and author nodes and the relationships between them. I was using this code, which works fine but is very slow:
paper = Node('Paper', id=post_id)
graph.merge(paper)
paper['created_time'] = created_time
graph.push(paper)
for author_id, author_name in paper_dict['authors']:
    researcher = Node('Person', id=author_id)
    graph.merge(researcher)
    researcher['name'] = author_name
    graph.push(researcher)
    wrote = Relationship(researcher, 'author', paper)
    graph.merge(wrote)
So, in order to write multiple relationships at once, I'm trying to use a transaction. My problem is that if I run it multiple times for the same papers and authors, it treats them as different entities and duplicates every node and relationship in the database (I tried running the script multiple times). This doesn't happen with the previous code. This is the code that uses a transaction:
tx = graph.begin()
paper = Node('Paper', id=post_id)
paper['created_time'] = created_time
tx.create(paper)
for author_id, author_name in paper_dict['authors']:
    researcher = Node('Person', id=author_id)
    researcher['name'] = author_name
    tx.create(researcher)
    wrote = Relationship(researcher, 'author', paper)
    tx.create(wrote)
tx.commit()
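
To make the difference I'm seeing concrete, here is a toy in-memory model of the two behaviours. It is not py2neo and does not talk to Neo4j: `ToyGraph`, `load_with_create` and `load_with_merge` are hypothetical names made up for illustration, and the merge rule (match on label plus `id`) is only my assumption about what the first script effectively does.

```python
# Toy stand-in for the database. NOT py2neo: just a model of
# "create" vs "merge" semantics under the assumptions stated above.


class ToyGraph:
    def __init__(self):
        self.nodes = []  # each node is a dict with a "label" plus properties

    def create(self, label, **props):
        """Always add a new node, even if an identical one already exists."""
        node = {"label": label, **props}
        self.nodes.append(node)
        return node

    def merge(self, label, key, **props):
        """Reuse the node with the same label and key value, else add one."""
        for node in self.nodes:
            if node["label"] == label and node.get(key) == props[key]:
                node.update(props)  # refresh properties on the existing node
                return node
        return self.create(label, **props)


def load_with_create(graph, post_id, authors):
    # Mirrors the transaction version: every call adds fresh nodes.
    graph.create("Paper", id=post_id)
    for author_id, author_name in authors:
        graph.create("Person", id=author_id, name=author_name)


def load_with_merge(graph, post_id, authors):
    # Mirrors the slow version: nodes are looked up by id first.
    graph.merge("Paper", "id", id=post_id)
    for author_id, author_name in authors:
        graph.merge("Person", "id", id=author_id, name=author_name)


# Hypothetical sample data: one paper with two authors.
authors = [(1, "Ada"), (2, "Grace")]

created_graph = ToyGraph()
merged_graph = ToyGraph()
for _ in range(2):  # simulate running the script twice
    load_with_create(created_graph, "p1", authors)
    load_with_merge(merged_graph, "p1", authors)

print(len(created_graph.nodes))  # 6: every node is duplicated
print(len(merged_graph.nodes))   # 3: the second run reuses the nodes
```

What I'm after is the second behaviour, but with everything written in one transaction.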