
I'm trying to load a graph with two node types (Autor, Paper) and a relationship between them using the import tool. Right now I have these two files, which, as far as I understand, should look like this:

authors.csv:
:Author(Autor) :Adscription(Autor) :PMID(Paper)
Author1 Department of Hematology. 31207293

Papers.csv:
:PMID(Paper) :PaperName(Paper) :AuthorList(Autor)
31207293 A huge paper name Author1,Author2,

These files are stored in /var/lib/neo4j/import

With this in mind, I ran the following command:

sudo neo4j-admin import --database=graph.db --id-type=STRING --mode=csv --delimiter="  " --nodes :Autor:Paper="authors.csv,Papers.csv"

but I got

WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
Expected '--nodes' to have at least 1 valid item, but had 0 []
usage: neo4j-admin import [--mode=csv] [--database=<name>]
                      [--additional-config=<config-file-path>]
                      [--report-file=<filename>]
                      [--nodes[:Label1:Label2]=<"file1,file2,...">]
                      [--relationships[:RELATIONSHIP_TYPE]=<"file1,file2,...">]

Right now, I'm only attempting to load the Paper and Autor nodes. I'm able to do this in the browser with:

USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM   "file:///authors.csv" AS row
MERGE (c:Autor {Name: row.Autor, Adscription: row.Adscription, PMID: row.PMID})

but doing so takes a long time.

1 Answer


The "Max 1024 open files" warning is probably not affecting you; the Neo4j manual it mentions has more info.

If you are importing large amounts of data, your Cypher is slow because of MERGE, which has to search for an existing matching node for every row before it creates one. If you know that authors.csv contains a unique entry for each author, you do not need MERGE, since it will never match an existing node.

Try switching MERGE to CREATE. It should be much faster.
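
Before switching, it may be worth confirming the uniqueness assumption, because CREATE will happily insert duplicates. Note that the MERGE in the question matches on all three properties together, so CREATE only gives the same result if no (Autor, Adscription, PMID) combination repeats. Below is a rough Python sketch of such a check. The function name `find_duplicate_rows` is made up for illustration, and it assumes the file is comma-delimited with the header names used in the LOAD CSV query (Autor, Adscription, PMID); adjust both if your file differs.

```python
import csv
from collections import Counter


def find_duplicate_rows(path, columns=("Autor", "Adscription", "PMID"), delimiter=","):
    """Return {key_tuple: count} for every combination of `columns` seen more than once.

    Hypothetical helper: the column names and delimiter are assumptions
    taken from the LOAD CSV query, not from the actual file.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        # Count each key combination across all data rows.
        counts = Counter(tuple(row[col] for col in columns) for row in reader)
    # Keep only the combinations that occur more than once.
    return {key: n for key, n in counts.items() if n > 1}


# Example usage (path taken from the question):
# duplicates = find_duplicate_rows("/var/lib/neo4j/import/authors.csv")
# if duplicates:
#     print("Not safe to switch to CREATE:", duplicates)
```

An empty result means every row is distinct on those columns, so CREATE should produce the same nodes as MERGE. Passing `columns=("Autor",)` instead checks whether each author name appears only once.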

Joey Kilpatrick