I tried to follow the steps described here: https://docs.yugabyte.com/v1.1/manage/data-migration/cassandra/bulk-export/
wget https://github.com/YugaByte/cassandra-loader/releases/download/v0.0.27-yb-2/cassandra-loader
wget https://github.com/YugaByte/cassandra-loader/releases/download/v0.0.27-yb-2/cassandra-unloader
chmod a+x cassandra-unloader
chmod a+x cassandra-loader
Since the above tools are JVM-based, I installed OpenJDK:
sudo yum install java-1.8.0-openjdk
Then I exported the rows using:
% cd /home/yugabyte/entity
% ./cassandra-unloader -schema "my_ksp.my_table(id,type,details)" -host <tserver-ip> -f export.csv -numThreads 3
Total rows retrieved: 10000
Here details is a JSONB column. Next, I created a new table my_table_new in the same cluster and tried to load this data into it:
./cassandra-loader -schema "my_ksp.my_table_new(id,type,details)" -host <tserver-ip> -f /home/yugabyte/entity -numThreads 3 -progressRate 200000 -numFutures 256 -rate 5000 -queryTimeout 65
But I get errors of the form:
Row has different number of fields (12) than expected (3)
It looks like the default delimiter "," in the CSV file is causing the issue, since the JSONB data in the CSV file also contains commas.
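To illustrate the suspected cause, here is a minimal sketch that splits one row on every comma. The row contents are made up for illustration, and it assumes the exported file writes the JSON value without quoting or escaping, which I have not verified:

```shell
# Hypothetical exported row: id, type, and a JSON "details" value that contains commas.
row='1,user,{"name":"a","age":3,"tags":["x","y"]}'

# Count the fields obtained by splitting naively on every comma.
fields=$(printf '%s\n' "$row" | awk -F',' '{print NF}')

# The count is larger than the 3 columns in the schema.
echo "fields seen: $fields (expected 3)"
```

A loader that splits on "," the same way would see more fields than the 3 columns named in -schema, which matches the shape of the error above.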
As an alternative, I tried passing -delim "\t" to cassandra-unloader, but that seems to insert the two characters "\" and "t" rather than a single tab character. Is that expected?
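For reference, the shell itself passes "\t" in double quotes as two literal characters. A small sketch, assuming bash or zsh (the $'...' form is ANSI-C quoting and is not available in every shell):

```shell
# Inside double quotes, \t is NOT expanded: it stays a backslash followed by the letter t.
literal="\t"

# ANSI-C quoting expands \t to one real tab character.
tab=$'\t'

echo "length of \"\\t\": ${#literal}"
echo "length of \$'\\t': ${#tab}"

# Dump the bytes to confirm what each variable actually holds.
printf '%s' "$literal" | od -c
printf '%s' "$tab" | od -c
```

So one thing to try would be -delim $'\t' on both cassandra-unloader and cassandra-loader. This assumes the tools use the argument bytes as given; I have not confirmed how they parse -delim.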