4

I use COPY command to take a copy of data. COPY looks more simple than sstables. But it looks like it can't import empty string. Columns which are empty in original table are null in imported. Steps to reproduce below.

CREATE TABLE empty_example (id bigint PRIMARY KEY, empty_column text, null_column text);
INSERT INTO empty_example (id, empty_column) VALUES ( 1, '');
SELECT * from empty_example ;
 id | empty_column | null_column
----+--------------+-------------
  1 |              |        null
COPY empty_example TO 'empty_example.csv';
TRUNCATE empty_example ;
COPY empty_example FROM  'empty_example.csv';
SELECT * from empty_example ;
 id | empty_column | null_column
----+--------------+-------------
  1 |         null |        null

I tried to play with WITH options but couldn't solve the issue. Is it possible to preserve null/empty string distinction with COPY?

kharus
  • 41
  • 1
  • 3

1 Answers1

3

Which version of Cassandra are you using ? Since Cassandra 3.4, COPY commands has a bunch of options to handle empty or null strings:

cqlsh:system_schema> help COPY

        COPY [cqlsh only]

          COPY x FROM: Imports CSV data into a Cassandra table
          COPY x TO: Exports data from a Cassandra table in CSV format.

        COPY <table_name> [ ( column [, ...] ) ]
             FROM ( '<file_pattern_1, file_pattern_2, ... file_pattern_n>' | STDIN )
             [ WITH <option>='value' [AND ...] ];

        File patterns are either file names or valid python glob expressions, e.g. *.csv or folder/*.csv.

        COPY <table_name> [ ( column [, ...] ) ]
             TO ( '<filename>' | STDOUT )
             [ WITH <option>='value' [AND ...] ];

        Available common COPY options and defaults:

          DELIMITER=','           - character that appears between records
          QUOTE='"'               - quoting character to be used to quote fields
          ESCAPE='\'              - character to appear before the QUOTE char when quoted
          HEADER=false            - whether to ignore the first line
          NULL=''                 - string that represents a null value

As you can see, by default the option NULL='' means that empty string is treated as null value. To change this behavior, set NULL='null' or whatever character you want for null value ...

doanduyhai
  • 8,712
  • 27
  • 26