
I'm trying to import a large CSV file into Neo4j 3.x via Cypher Shell (instead of neo4j-shell) on a macOS dev machine.

The import statements are defined within a cypher script file.

I have set the correct import directory within the conf file as "conf-path for import CSV files".

When I pipe in the command as follows:

cat <path to the Cypher script file> | $NEO4J_HOME/bin/cypher-shell -u user -p password --format auto

the path to the Neo4j App is prepended when the shell tries to access the CSV files. That leads to a concatenated path like this:

file:/<path to the Neo4j App/neo4j-community-3.x>/file:<path to the CSV file set within conf>/Import.CSV

and that throws a "Couldn't load the external resource at:..." error.

The Cypher script is loaded correctly, because the CONSTRAINT commands at the top are executed as intended. The script fails when it tries to access the CSV files with LOAD CSV.

Is there any additional setting I have to make to let the Cypher Shell know that it should not add the Neo4j App directory?

I tried to find this in the documentation without much luck.

Any help is greatly appreciated.

Thanks

Krid


2 Answers


Out of the box, the only place that CSV can be imported from is the import folder.

Any LOAD CSV statements must be relative to this directory; i.e. LOAD CSV FROM "file:///mydata.csv" loads the file mydata.csv that is located in the import folder.

This is for good reason; protection of the filesystem is very important, and being able to import any arbitrary CSV file from any location widens the attack vector for malicious uploads and/or executions.

However, if you want to change the directory that is considered the import directory, that is completely possible.

You can change the directory that is considered to be the import folder.

This can be done by setting dbms.directories.import in the config to point to a different folder; i.e. setting it to /var/uploads would allow CSV to be loaded from that directory, or set it to something like /home/krid/my-neo-imports/ to use that directory. The LOAD CSV file URL will then be relative to that location instead.
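A minimal sketch of that setting, using the example directory /home/krid/my-neo-imports from above. CONF_FILE here is a hypothetical scratch file standing in for the real config, which normally lives at $NEO4J_HOME/conf/neo4j.conf. Note that the value is a plain filesystem path rather than a file: URL; a value that is not an absolute path is resolved relative to the Neo4j home directory, which may be what produces a concatenated path like the one in the question.

```shell
# Hypothetical scratch file standing in for $NEO4J_HOME/conf/neo4j.conf.
CONF_FILE="$(mktemp "${TMPDIR:-/tmp}/neo4j-conf.XXXXXX")"

# Point the import directory at a plain absolute path (no "file:" prefix).
echo 'dbms.directories.import=/home/krid/my-neo-imports' >> "$CONF_FILE"

# Show the resulting line; Neo4j has to be restarted to pick up the change.
grep '^dbms\.directories\.import=' "$CONF_FILE"
```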

If you want, you can even set up a directory hierarchy within the import directory so that you can organize your imported files. For instance, you could put a dataset1 and a dataset2 directory in the import dir, and then put a file called members.csv in dataset1 and events.csv in dataset2. Then, you would load each file by doing LOAD CSV FROM "file:///dataset1/members.csv" and LOAD CSV FROM "file:///dataset2/events.csv", respectively.
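The layout described above could be sketched roughly as follows. IMPORT_DIR is a hypothetical temporary directory standing in for the configured import folder, and the CSV contents, node labels (Member, Event) and property names are made up purely for illustration. The file URLs in the Cypher script are relative to the import directory, not to the filesystem root.

```shell
# Hypothetical stand-in for the configured import directory.
IMPORT_DIR="$(mktemp -d "${TMPDIR:-/tmp}/neo4j-import.XXXXXX")"
mkdir -p "$IMPORT_DIR/dataset1" "$IMPORT_DIR/dataset2"

# Made-up sample data, one file per dataset directory.
printf 'id,name\n1,Alice\n' > "$IMPORT_DIR/dataset1/members.csv"
printf 'id,title\n1,Meetup\n' > "$IMPORT_DIR/dataset2/events.csv"

# Cypher script whose file URLs are relative to the import directory.
SCRIPT="$IMPORT_DIR/import.cypher"
cat > "$SCRIPT" <<'EOF'
LOAD CSV WITH HEADERS FROM "file:///dataset1/members.csv" AS row
MERGE (m:Member {id: row.id}) SET m.name = row.name;
LOAD CSV WITH HEADERS FROM "file:///dataset2/events.csv" AS row
MERGE (e:Event {id: row.id}) SET e.title = row.title;
EOF

# With a running server, the script would be piped in as in the question:
# cat "$SCRIPT" | "$NEO4J_HOME/bin/cypher-shell" -u user -p password
cat "$SCRIPT"
```

The cypher-shell call is left commented out because it needs a running Neo4j instance whose dbms.directories.import points at the same directory.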

If you want to be able to upload from any location, set it to "/".

If you set dbms.directories.import to root ("/"), then any file on the entire filesystem can be imported. However, you will need to specify paths relative to your filesystem root when uploading.
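As a small illustration of what "relative to your filesystem root" means, assuming dbms.directories.import is set to "/" and a hypothetical file at /var/uploads/mydata.csv, the file URL is simply file:// followed by the absolute path:

```shell
# Hypothetical absolute path to a CSV file somewhere on the filesystem.
CSV_PATH="/var/uploads/mydata.csv"

# With the import directory set to "/", the URL is file:// plus the absolute path.
FILE_URL="file://${CSV_PATH}"

# Print a Cypher statement that could be pasted into a script or cypher-shell.
echo "LOAD CSV FROM \"${FILE_URL}\" AS row RETURN row LIMIT 5;"
```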

Note that this option will make it easier for potential attackers to upload malicious import scripts.

Rebecca Nelson
  • Hello Rebecca, thanks for the help. This approach solved it. You're right with the security issue. But the problem with focusing on the /import directory is that it doesn't allow you to specify other drives for the import as well. I'll block it after the import :-) – Krid Apr 19 '18 at 12:44
  • @Krid no problem! I just wanted to be sure that the info is there. It's up to you to decide when and how to apply it, or even if it matters in your situation. – Rebecca Nelson Apr 19 '18 at 12:45

To simply load a CSV, you have to:

  • copy it into the import folder of Neo4j
  • in your Cypher script, use the file path file:///MY_CSV_FILE.csv
logisima
  • Hello Logisima, thanks for the tip. That would be a solution but my problem is that I have up to 200 CSV files that have to be uploaded in different time slices. I would prefer to have them sitting in 'their own' folders and only control the uploading path via the setting in the conf file. In my understanding that should be possible, because I think that's the intention of this setting ;-) - or do you think uploading directories ALWAYS have to be a subfolder of the Neo4j App? – Krid Apr 19 '18 at 12:10
  • The import folder can be configured in `neo4j.conf` if you want. And yes, I prefer to limit access to this folder for security reasons (otherwise you give Cypher access to your whole filesystem :) ) – logisima Apr 19 '18 at 13:09
  • Can I copy a csv file (or whatever) to the `import` neo4j folder from inside cypher-shell? – roschach May 28 '19 at 16:12