I have unloaded more than 100 CSV files into a folder. When I try to load those files into Cassandra using DSBulk load, specifying the folder location containing all these files, I get the error below:

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "unVocity-parsers input reading thread"

I wanted to see if anyone else has faced it and how it has been resolved.

Rajib Deb

1 Answer

Here are a few things you can try:

  1. You can pass any JVM option or system property to the dsbulk executable using the DSBULK_JAVA_OPTS environment variable. See this page for more details. Increase the allocated heap memory if possible.
  2. You can throttle dsbulk with the -maxConcurrentQueries option. Start with -maxConcurrentQueries 1, then raise the value until you reach the best throughput that does not trigger the OOM error. More on this here.
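
The two suggestions above can be combined in one invocation. As a sketch (the keyspace `ks`, table `t`, folder path, and 4 GB heap size are placeholders you would adapt to your environment):

```shell
# Assumption: "ks", "t", and the folder path are placeholders for your schema/data.
# 1. Raise the JVM heap available to dsbulk (here: 4 GB):
export DSBULK_JAVA_OPTS="-Xmx4g"

# 2. Load all CSV files in the folder with throttled concurrency;
#    start at 1 and raise the value until throughput is acceptable:
dsbulk load -url /path/to/csv-folder -k ks -t t -maxConcurrentQueries 1
```

If the load succeeds at a low concurrency, increase -maxConcurrentQueries gradually (e.g. 2, 4, 8) until you hit the best throughput that stays under the memory limit.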
adutra