
My question is a follow-up to the one I asked here => [1]. After a long conversation with Stephen Mallette, he showed me how to build a graph that is loaded when the server starts. My final script is this [2]. What do I want to do? Let's say I have:

[
{
    "host": "google.com",
    "ip": "8.8.8.8",
    "random": 25
},
{
    "host": "google.com",
    "ip": "1.2.3.4",
    "random": 10
}
]

For the first object, there will be a vertex with the property "host" and the value google.com (#1), a vertex with the property "ip" and the value 8.8.8.8 (#2), and a vertex with the property "random" and the value 25 (#3). I will also create 3 edges: host #1 -> ip #2, host #1 -> random #3 and ip #2 -> random #3. For the second object, I won't create another google.com vertex, because it already exists, but I will create the ip vertex (#4) and the random vertex (#5), plus the edges host #1 -> ip #4, host #1 -> random #5 and ip #4 -> random #5. So for an object O with k fields, there will be up to k new vertices and k * (k - 1) / 2 edges.
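To make the model concrete, here is a minimal Python sketch of the rule described above. It is not my actual Gremlin script: the function name `build_graph` and the dictionary used to find existing vertices are made up for illustration, and the "graph" is just plain Python data. It assumes every field value is hashable and that an edge is added for every pair of fields in every object, even if the same pair was already connected by an earlier object.

```python
from itertools import combinations


def build_graph(objects):
    """Return (vertex_ids, edges) for a list of flat JSON-like objects."""
    vertex_ids = {}  # (field, value) -> vertex id; used to reuse existing vertices
    edges = []       # (source id, target id) pairs, in creation order

    for obj in objects:
        ids = []
        for field, value in obj.items():
            key = (field, value)
            # Create the vertex only if this (field, value) pair is new.
            if key not in vertex_ids:
                vertex_ids[key] = len(vertex_ids) + 1
            ids.append(vertex_ids[key])
        # Connect every pair of vertices of this object: k * (k - 1) / 2 edges.
        edges.extend(combinations(ids, 2))

    return vertex_ids, edges


objects = [
    {"host": "google.com", "ip": "8.8.8.8", "random": 25},
    {"host": "google.com", "ip": "1.2.3.4", "random": 10},
]

vertices, edges = build_graph(objects)
print(len(vertices), len(edges))  # 5 6
```

With the two objects above, this gives the 5 vertices (#1 to #5) and the 6 edges listed in the previous paragraph.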

My question is: can my code be improved? I tried it on a JSON file with 10k objects, each with 7 fields, and it is slow. How can I do this faster? Can I process the data in batches? I have heard about indexes, but I don't know what they are or how they would help here.
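By "batches" I mean something like the following Python sketch, which only shows how the input would be split into chunks. The helper name `batches` and the batch size are hypothetical; whether committing once per chunk actually helps with my graph is exactly what I don't know.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items from `items`."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Stand-in records; the real input would be the parsed JSON objects.
records = [{"id": i} for i in range(10)]
sizes = [len(chunk) for chunk in batches(records, 4)]
print(sizes)  # [4, 4, 2]
```

Each chunk would then be loaded and committed as one unit, instead of handling the objects one by one.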

[1] Normal JSON to GraphSON format

[2] https://pastebin.com/g7qnQdq9

Edit: OK, I hard-coded multiple graph.createIndex(X, Vertex.class) commands, where X is the name of each field in my JSON. It does seem to be faster. How can I improve it further? What am I doing wrong, and how can I do it better? Should I instead try to generate a JSON file in the format Gremlin uses when it exports a graph? I think that format is extremely hard to produce. I can't find proper documentation, and I really need an answer, since this is a work-related problem.
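My current understanding of why the index helps, as a small Python sketch: without an index, checking whether a vertex already exists means scanning all vertices, while with an index it is a single keyed lookup. This only illustrates the idea and is an assumption on my part, not how TinkerGraph implements createIndex; `find_by_scan` and `build_index` are made-up names.

```python
def find_by_scan(vertices, field, value):
    """Look at every vertex until one matches: cost grows with graph size."""
    for vertex in vertices:
        if vertex.get(field) == value:
            return vertex
    return None


def build_index(vertices, field):
    """Map each value of `field` to its vertex, so lookups avoid the scan."""
    return {vertex[field]: vertex for vertex in vertices if field in vertex}


vertices = [
    {"id": 1, "host": "google.com"},
    {"id": 2, "ip": "8.8.8.8"},
    {"id": 4, "ip": "1.2.3.4"},
]

ip_index = build_index(vertices, "ip")

# Both approaches find the same vertex; only the amount of work differs.
scanned = find_by_scan(vertices, "ip", "1.2.3.4")
indexed = ip_index.get("1.2.3.4")
print(scanned is indexed)  # True
```

Since my script does an "already exists?" check for every field of every object, the scan version repeats that full pass over and over as the graph grows.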

Edit 2: By the way, I just tried this script => https://pastebin.com/Uts4KQCH with a JSON file of 50k objects. Around the 38k mark it slowed down a lot, from 1000 objects in 1.5 seconds to 1000 objects in 30 seconds.

Adrian Pop
