I have a running AWS Neptune graph database that is used in a production environment. I have since identified new vertices that I would like to add, and these will connect to specific existing vertices in the DB.
I loaded the original data set by first converting it with the 'csv-to-neptune-bulk-format' script from https://github.com/awslabs/amazon-neptune-tools/tree/master/csv-to-neptune-bulk-format and then bulk loading the result.
My question is: how can I bulk load my additional set in the most efficient way? I have two ideas on how to approach this, but I'm hoping that someone knows a simpler way.
Approach 1 is to use the above 'csv-to-neptune-bulk-format' script to split up the new additional set and then bulk load that. I will then have duplicate vertices wherever the new set overlaps with the original, because the script assigns new vertex IDs to the vertices where the new set connects to the original set. I have a function to then merge these duplicate vertices. This approach can be quite resource intensive, though.
Approach 2 is to split up the additional set with the above script and then, in the generated edge csv, replace the IDs of the connecting vertices for the edges that will connect the original set with the additional set. So basically the edge csv will change from [~id,~label,~from,~to] to [~id,~label, corresponding vertex IDs generated by the first bulk upload,~to].
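To make Approach 2 concrete, here is a minimal sketch of the rewrite I have in mind. It assumes I can build a lookup from the IDs the script generated for the overlapping vertices to the IDs those vertices already have in the DB. The function name `remap_edge_endpoints`, the `id_map` dict, and the sample IDs are all hypothetical and only for illustration; only the `~id`, `~label`, `~from`, `~to` headers come from the generated edge csv.

```python
import csv
import io


def remap_edge_endpoints(edges_csv, id_map):
    """Return the edge csv with ~from/~to values replaced wherever they appear in id_map.

    edges_csv: text of an edge file with ~id,~label,~from,~to headers.
    id_map: hypothetical lookup {newly generated vertex ID: existing vertex ID}.
    """
    reader = csv.DictReader(io.StringIO(edges_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in ("~from", "~to"):
            # IDs that are not in the lookup belong to genuinely new vertices
            # and are left unchanged.
            row[col] = id_map.get(row[col], row[col])
        writer.writerow(row)
    return out.getvalue()


# Made-up sample data: "new-1" is a duplicate of the existing vertex "orig-42".
edges = (
    "~id,~label,~from,~to\n"
    "e1,knows,new-1,new-2\n"
    "e2,knows,new-3,new-2\n"
)
id_map = {"new-1": "orig-42"}

remapped = remap_edge_endpoints(edges, id_map)
print(remapped)
```

With this I would also have to drop the overlapping vertices (here `new-1`) from the generated vertex csv before loading, so that only the genuinely new vertices are created.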
I'm hoping that I've missed some documentation or logic somewhere that will allow me to reuse the existing vertex IDs, so that I can simply bulk load the newly processed vertex csv together with the edge csv that connects the new vertices to the original vertices. Any help or advice will be greatly appreciated.