2

I have a CSV file that has columns that contain arrays. The file looks like this:

siblings
[3010,3011,3012]
[2010,2012]

What I am trying to do is to return the elements of the arrays as separate rows.

I looked through the documentation and found UNWIND function. When I tried the example shown in the documentation:

UNWIND [1, 2, 3] AS x
RETURN x

everything worked fine. When I tried it with my data the query looked like this:

LOAD CSV WITH HEADERS FROM
'file:///test.csv' AS line1
UNWIND line1.siblings as a
RETURN a

and the result was

[3010,3011,3012]
[2010,2012]

instead of:

3010
3011
3012
...

Does anyone know what I am doing wrong?

Dave Bennett
  • 10,996
  • 3
  • 30
  • 41
Porjaz
  • 771
  • 1
  • 8
  • 28

2 Answers2

2

The CSV columns are all handled as strings so you need to treat them as such. In your case you can remove the brackets from the siblings column prior to splitting the string into a collection. That step would help you to avoid pre-processing the file. Once you split the string then you still have a collection of strings.

WITH "[1001,1002,1003]" as numbers_with_brackets_in_a_string
WITH substring( numbers_with_brackets_in_a_string, 1, size(numbers_with_brackets_in_a_string)-2 ) as numbers_in_a_string
UNWIND split( numbers_in_a_string, ',' ) as number_as_a_string
RETURN number_as_a_string
Dave Bennett
  • 10,996
  • 3
  • 30
  • 41
0

I have found an ugly way of fixing this.

I did the following thing:

LOAD CSV WITH HEADERS FROM
'file:///test.csv' AS line1
WITH line1, split(line1.siblings, ",") as s
UNWIND s as sib
RETURN sib

after which I got the following thing:

[3010
3011
3012]
...

I removed [] from the CSV file and I got the output that I desired.

I know this is an ugly solutions and I would appreciate if someone could find a better one.

Porjaz
  • 771
  • 1
  • 8
  • 28