
I have a service that receives tasks and, when a task completes, instantiates a class named SubmitPy2neo (shown below as Send_GraphDatabase), which inherits from the Thread class, to insert the data into my Neo4j database using py2neo in Python. For each task I create a separate thread so that inserts run in parallel and the data reaches Neo4j faster. However, once things go multithreaded, I get this error:

Exception KeyError: (u'http://graphdb_server/db/data/relationship/123123') in ignored.

Any idea how I can insert these tasks into the graph database using py2neo in a multithreaded way?

The multithreaded class is as follows:

import threading

# GraphDB_Driver is defined elsewhere (not shown).
class Send_GraphDatabase(threading.Thread):
    def __init__(self, JobNumber, whois=None, dig=None, hosts=None, extra_field=None):
        threading.Thread.__init__(self)
        self.JobNumber = JobNumber
        self.whois = whois
        self.dig = dig
        self.hosts = hosts
        self.extra_field = extra_field

    def run(self):
        # Each thread builds its own driver and stores the results of one job.
        gd = GraphDB_Driver(self.JobNumber, self.extra_field)
        gd.StoreRoot()
        gd.StoreURLs()
        gd.externalStoreWappalyzer()
        gd.StoreFiles()
        # Optional sections are stored only when they were supplied.
        if self.whois:
            gd.StoreWhois(self.whois)
        if self.dig:
            gd.StoreDig(self.dig)
        if self.hosts:
            gd.StoreHosts(self.hosts)

Whenever a task finishes, I create an instance of this class and pass it the job number so that the job is submitted to the graph database.

Inside each of the Store* functions I do something similar to the following:

from py2neo import Node, Relationship

# Create the node, then link it to an existing source node.
n = Node("LABEL", key1=val1, key2=val2)
self.g.create(n)

r = Relationship(src, "HAS", n)
self.g.create_unique(r)

So I simply interact with the graph database using create() to create the node and create_unique() to create the relationship.

I.el-sayed
  • Some more details and your class would help. – Martin Preusse Jun 11 '15 at 11:51
  • that code doesn't help to understand the error. – Martin Preusse Jun 11 '15 at 12:01
  • @MartinPreusse How about now ? – I.el-sayed Jun 11 '15 at 12:55
  • What generates the JobNumber? If JobNumber is not unique, and one of your threads attempts to create a new node and relationship to it when another thread has already created a node + relationship, that would throw an error for the relationship creation as you're using `create_unique` (node creation would not throw an error as there are no uniqueness constraints). I have to say though that I don't understand what your error means by `relationship/ in ignored`. – RedCraig Jun 02 '16 at 10:52

1 Answer


Not to digress, but if you are really concerned about Neo4j performance, I suggest using raw Cypher queries or Gremlin. I would also recommend talking to the Neo4j HTTP REST endpoint directly, since py2neo has a lot of issues.
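As a rough illustration of the raw Cypher route, here is a minimal sketch that posts a MERGE statement to the transactional Cypher endpoint. Several things here are assumptions rather than anything from the question: the /db/data/transaction/commit path and the {param} placeholder syntax apply to Neo4j 2.x/3.x (newer versions use a different path and $param), the helper names build_payload and commit are made up for the example, and the code is written for Python 3. MERGE is match-or-create, but a uniqueness constraint on the merged property is still advisable if several threads may write the same node at once.

```python
import json
import urllib.request

# Assumption: Neo4j 2.x/3.x transactional endpoint on the question's host.
TX_COMMIT_URL = "http://graphdb_server/db/data/transaction/commit"

# MERGE matches an existing node/relationship or creates it if missing.
# Labels and relationship types cannot be parameters, so they are literal.
STATEMENT = (
    "MATCH (src) WHERE id(src) = {src_id} "
    "MERGE (n:LABEL {key1: {key1}, key2: {key2}}) "
    "MERGE (src)-[:HAS]->(n)"
)


def build_payload(src_id, key1, key2):
    """Build the JSON body for one parameterised Cypher statement."""
    return {
        "statements": [
            {
                "statement": STATEMENT,
                "parameters": {"src_id": src_id, "key1": key1, "key2": key2},
            }
        ]
    }


def commit(payload, url=TX_COMMIT_URL):
    """POST the payload; each call runs in its own transaction."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The endpoint reports Cypher failures in an "errors" list in the body.
    if body.get("errors"):
        raise RuntimeError(body["errors"])
    return body["results"]
```

Each thread could call commit(build_payload(...)) on its own, since every request is an independent transaction and no client-side object is shared between threads.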

hspandher