In the following block of code, a DataFrame is sometimes created and sometimes not.

It seems the issue is that Neo4j is racing to execute the commands. I have tried splitting the run commands across multiple sessions, as in the code below, and I have also tried putting all of them into a single session. Either way, the query only returns results some of the time. How can I stop this eager operation?

set_label_query = """
MATCH (s:Startup)
WHERE "{vertical_original}" IN s.Verticals
WITH s
MATCH (s)<-[:INVESTOR_INVESTED_IN]-(i:Investor)
WITH s, i
MATCH(i)<-[:MADE_LP_COMMITMENT_TO_VC]-(l:Limited_Partner)
SET i:{vertical}, l:{vertical}, s:{vertical}
RETURN COUNT(i)
;
"""
create_gds_project_query = '''
CALL gds.graph.project(
        'climate_cleantech_undirected',
        ['{vertical}', 'Limited_Partner', 'Investor', 'Startup'],
        {{INVESTOR_INVESTED_IN: {{orientation: 'UNDIRECTED'}},
        MADE_LP_COMMITMENT_TO_VC: {{orientation: 'UNDIRECTED'}}
        }}
        );
'''

create_rank_query = '''
CALL gds.pageRank.stream('climate_cleantech_undirected', {{
            nodeLabels:['{vertical}'] ,
            maxIterations: 20,
            dampingFactor: 0.85
        }})
        YIELD nodeId, score
        WITH gds.util.asNode(nodeId) AS node, score
        WHERE 'Investor' IN labels(node)
        RETURN node.Name, node.Website, score
        ORDER BY score DESC;
'''

remove_graph_query = "CALL gds.graph.drop('climate_cleantech_undirected', false)"

with neo4j_driver.session() as session:
    with session.begin_transaction() as tx:
        tx.run(set_label_query.format(vertical=vertical))
        tx.run(create_gds_project_query.format(vertical=vertical))
        result_data = tx.run(create_rank_query.format(vertical=vertical)).data()
        df = pd.DataFrame(result_data)
        print(df)
        tx.commit()
print('execute 2')
with neo4j_driver.session() as session:
    with session.begin_transaction() as tx:
        tx.run(remove_label_query.format(vertical=vertical))
        tx.run(remove_graph_query)
        tx.commit()

Result when something is returned: [image not included]

le Minh Nguyen
  • Your question needs to show your Cypher code, and also show what difference there is in `result_data` when a DataFrame is created versus when it is not. This is not a race condition issue, since your code is executing synchronously. – cybersam Apr 12 '23 at 15:34
  • Hi, thank you for the suggestion. I have edited it as per your suggestion – le Minh Nguyen Apr 12 '23 at 15:41

1 Answer

It looks like set_label_query expects {vertical_original} to be replaced with some value, but your code is only replacing {vertical} with a value. Therefore, set_label_query ends up not doing anything.
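
To make the point above concrete, here is a minimal sketch of filling in both placeholders before the query is sent. The template text is copied from the question; the helper name `render_set_label_query` and the example values `"ClimateTech"` and `"Climate Tech"` are assumptions for illustration only, not something taken from the original code. It also shows that, with the template exactly as posted, calling `str.format` with only `vertical` raises a `KeyError` for the missing placeholder.

```python
# Template copied from the question: it contains two placeholders,
# {vertical} (the label to set) and {vertical_original} (the value matched in s.Verticals).
SET_LABEL_TEMPLATE = """
MATCH (s:Startup)
WHERE "{vertical_original}" IN s.Verticals
WITH s
MATCH (s)<-[:INVESTOR_INVESTED_IN]-(i:Investor)
WITH s, i
MATCH(i)<-[:MADE_LP_COMMITMENT_TO_VC]-(l:Limited_Partner)
SET i:{vertical}, l:{vertical}, s:{vertical}
RETURN COUNT(i)
"""


def render_set_label_query(vertical: str, vertical_original: str) -> str:
    """Fill in both placeholders (hypothetical helper, not part of the question)."""
    return SET_LABEL_TEMPLATE.format(
        vertical=vertical, vertical_original=vertical_original
    )


# Assumed example values: a label-safe name and the raw string stored in s.Verticals.
query = render_set_label_query("ClimateTech", "Climate Tech")
print(query)

# Supplying only `vertical` leaves {vertical_original} unfilled,
# so str.format raises KeyError naming the missing placeholder.
try:
    SET_LABEL_TEMPLATE.format(vertical="ClimateTech")
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]
print("missing placeholder:", missing_key)
```

In the question's code, the rendered string would take the place of `set_label_query.format(vertical=vertical)` in the first `tx.run(...)` call, with whatever variable holds the original vertical name passed as the second argument.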

cybersam