I'm trying to load a large (30 GB) CSV file into my Cassandra cluster. I suspect I'm overloading the Cassandra driver, which causes the load to crash at some point. While the data loads I get a repeated warning message, until eventually it stops with an error that aborts the process.

[screenshot of the repeated warning and the final error]

My current loading command is: dsbulk load -url data.csv -k hotels -t reviews -delim '|' -header true -h '' -port 9042 -maxConcurrentQueries 128

Adding -maxConcurrentQueries 128 made no difference: I still get the same errors.

Any idea how I can modify my command to make it work?
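For context, this is the variant I was planning to try next: lowering the concurrency further and throttling the overall request rate via dsbulk's executor settings. The maxPerSecond value of 10000 is just a guess on my part and would need tuning for the cluster:

```shell
# Same load as before, but with fewer concurrent queries and a
# rate limit so the driver isn't flooded (values are guesses):
dsbulk load -url data.csv -k hotels -t reviews -delim '|' -header true \
  -h '' -port 9042 \
  -maxConcurrentQueries 32 \
  --dsbulk.executor.maxPerSecond 10000
```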
