I want to create multiple Neo4j nodes and relationships in one Cypher transaction. I'm using py2neo, which allows issuing multiple Cypher statements in one transaction.

I thought I'd add a statement for each node and relationship I create:

tx.append('CREATE (n:Label { prop: val })')
tx.append('CREATE (m:Label { prop: val2 })')

Now I want to create a relationship between the two created nodes:

tx.append('CREATE (n)-[:REL]->(m)')

This doesn't work as expected. No relationship is created between the first two nodes, because n and m are not bound in the context of the last statement. Instead, that statement creates a new relationship between two new nodes, so four nodes are created in total.

Is there a way around this? Or should I combine all the calls to CREATE (around 100,000 per logical transaction) in one statement?

It hurts my brain just thinking about such a statement, because I'd need to build everything in one big StringIO, and I'd lose the ability to use Cypher query parameters, so I'd have to serialize dictionaries to text myself.

UPDATE: The actual graph layout is more complicated than that. I have multiple relationship types, and each node is connected to at least two other nodes, while some nodes are connected to hundreds of nodes.

zmbq
  • It won't fail, it will just create two nodes with no labels or properties. The relationship will still be created. – Nicole White Dec 09 '15 at 20:37
  • OK, yes, this doesn't work as expected... – zmbq Dec 09 '15 at 20:53
  • That's because identifiers are only relevant within the scope of the query: http://stackoverflow.com/a/34074151/2848578 – Nicole White Dec 09 '15 at 21:01
  • Yes, I know that, I was hoping there was some way around this without creating one huge query. – zmbq Dec 09 '15 at 21:02
  • If you are doing a large data import, consider using [`LOAD CSV`](http://neo4j.com/docs/stable/query-load-csv.html). See http://stackoverflow.com/questions/34118491/fastest-way-to-import-to-neo4j/34143376#34143376 for an example. Also see http://stackoverflow.com/questions/34124759/batch-loading-neo4j/34140006#34140006 for an example using py2neo `WriteBatch`. – William Lyon Dec 09 '15 at 21:53
  • Yes, I think I'll need to go the CSV way for performance reasons, and just forget about transactions. – zmbq Dec 09 '15 at 21:54

2 Answers

You don't need multiple queries. You can use a single CREATE to create each relationship and its related nodes:

tx.append('CREATE (:Label { prop: val })-[:REL]->(:Label { prop: val2 })')
cybersam
  • The example I've given is over-simplified. I have numerous relationships between nodes, I can't create them with one create statement. – zmbq Dec 09 '15 at 20:36

Do something like this:

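from py2neo import Graph

graph = Graph()  # assumes a Neo4j server at the default local address

rels = [(1,2), (3,4), (5,6)]

# Each statement creates both nodes and the relationship between them,
# so n and m are still in scope where the relationship is created.
query = """
CREATE (n:Label {prop: {val1} }),
       (m:Label {prop: {val2} }),
       (n)-[:REL]->(m)
"""

tx = graph.cypher.begin()

# Queue one parameterized statement per pair, then commit them together.
for val1, val2 in rels:
    tx.append(query, val1=val1, val2=val2)

tx.commit()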
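
The pattern above creates both nodes in the same statement as the relationship. When a node takes part in many relationships, as in the question's update, one possible variation is to create every node first and then add each relationship in its own statement that looks up both ends with MATCH. The sketch below only builds the (query, parameters) pairs. It assumes each node carries a unique `key` property (a hypothetical name, not from the question) and uses the same `{param}` syntax as the query above. Cypher does not accept a relationship type as a parameter, so the type is interpolated into the query text after a simple check.

```python
import re

# Hypothetical schema: every node has a unique `key` property to look it up by.
NODE_QUERY = "CREATE (:Label {key: {key}, prop: {prop}})"
REL_TEMPLATE = (
    "MATCH (a:Label {key: {start}}), (b:Label {key: {end}}) "
    "CREATE (a)-[:%s]->(b)"
)


def rel_query(rel_type):
    """Return the relationship query for one type, rejecting unsafe names."""
    if not re.match(r"^[A-Za-z_][A-Za-z0-9_]*$", rel_type):
        raise ValueError("unexpected relationship type: %r" % (rel_type,))
    return REL_TEMPLATE % rel_type


def build_statements(nodes, rels):
    """nodes: (key, prop) pairs; rels: (start_key, rel_type, end_key) triples."""
    statements = [(NODE_QUERY, {"key": k, "prop": p}) for k, p in nodes]
    statements += [
        (rel_query(t), {"start": s, "end": e}) for s, t, e in rels
    ]
    return statements
```

Each pair can then be queued on the same transaction with `tx.append(query, **params)` before `tx.commit()`. Because the relationship statements run after the node statements in that transaction, the MATCH clauses can find the nodes created earlier. An index on the lookup property would likely matter for speed at this scale.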

And if your data is large enough, consider doing this in batches of 5,000 or so.
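
A minimal sketch of that batching, assuming `graph` is a py2neo 2.0 `Graph` and `query` takes the `val1` and `val2` parameters shown above. The helper names `batches` and `load_in_batches` are made up for illustration. Note that each batch is committed as its own transaction, so the load as a whole is no longer atomic.

```python
def batches(rows, size=5000):
    """Yield consecutive slices of `rows`, each with at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_in_batches(graph, query, rels, size=5000):
    """Commit one transaction per batch of (val1, val2) pairs.

    `graph` is assumed to expose `graph.cypher.begin()` as in py2neo 2.0.
    """
    for batch in batches(rels, size):
        tx = graph.cypher.begin()
        for val1, val2 in batch:
            tx.append(query, val1=val1, val2=val2)
        tx.commit()
```

With the 100,000 statements mentioned in the question and the default size, this would issue 20 commits.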

Nicole White