I am migrating data from EC2 Cassandra nodes to DataStax Astra (Premium account) using the DSBulk utility.

Command used: dsbulk load -url folder_created_during_unload -header true -k keyspace -t table -b "secure-connect-file.zip" -u username -p password

This command fails with an error after a few seconds. On checking the documentation, I found that I can add --executor.maxPerSecond to the command to limit the load rate.

After this, the load command ran without any errors. But if I enter a value over 15,000, the load command starts failing with the same error again.
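
For reference, the throttled invocation described above might look like the following sketch. It reuses the placeholder arguments from the original command and the 15,000 value reported as the highest working setting; the `MAX_PER_SECOND` variable name is only illustrative, and the `echo` prefix makes this a dry run that prints the command rather than executing it.

```shell
#!/bin/sh
# Rate cap passed to DSBulk. 15000 is the highest value that worked in
# this question; the right value for another database is an assumption
# to verify, so lower it if the errors come back.
MAX_PER_SECOND=15000

# Dry run: "echo" prints the command instead of running it.
# Remove "echo" to start the actual load.
echo dsbulk load \
  -url folder_created_during_unload \
  -header true \
  -k keyspace -t table \
  -b "secure-connect-file.zip" \
  -u username -p password \
  --executor.maxPerSecond "$MAX_PER_SECOND"
```

Keeping the cap in a variable makes it easy to retry with a different value once the database limit changes.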


Now, if a table has over 100M entries and 15,000 entries are migrated every second, it would take hours to complete the migration of one table. The complete database would take several days to migrate.
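
As a rough back-of-the-envelope check of these figures, the sketch below estimates the load time for a 100M-row table. It assumes a perfectly sustained rate and one row per operation, which real loads will not match exactly; the function names `estimate_seconds` and `format_duration` are hypothetical helpers, not part of DSBulk. The second call uses 4,000 per second, the default Astra limit cited in an answer below.

```shell
#!/bin/sh
# estimate_seconds ROWS RATE
# Seconds needed to load ROWS at RATE rows per second, rounded up.
estimate_seconds() {
  rows="$1"
  rate="$2"
  echo $(( (rows + rate - 1) / rate ))
}

# format_duration SECONDS
# Print a number of seconds as "Xh Ym Zs".
format_duration() {
  total="$1"
  echo "$(( total / 3600 ))h $(( (total % 3600) / 60 ))m $(( total % 60 ))s"
}

# 100M rows at the 15,000/sec cap from the question
format_duration "$(estimate_seconds 100000000 15000)"   # prints: 1h 51m 7s

# 100M rows at 4,000/sec
format_duration "$(estimate_seconds 100000000 4000)"    # prints: 6h 56m 40s
```

Multiplying the per-table estimate across all tables gives a first approximation of the total migration window.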

I want to understand what is causing this error and if there is a way to load the data at a higher speed.

Erick Ramirez
Vitul Goyal
  • Hello Vitul! I've sent this question on to our folks who spend a lot of time working with DSBulk. We'll get back to you with an answer as soon as we can. – Aaron Sep 29 '21 at 15:50
  • In the meantime, feel free to connect with me on LinkedIn. https://www.linkedin.com/in/aaronploetz/ – Aaron Sep 29 '21 at 15:51

2 Answers

What's happening here is that DSBulk is running into the rate limit on the database. At the moment, it looks like the only way to increase that rate limit is to submit a ticket to support.

To submit a ticket, look for the "Other Resources" section of the Astra Dashboard's left nav. Click "Get Support" on the bottom.

Get Support is in the lower left corner of the page.

When the "Help Center" pops up, click "Create Request" in the lower right corner.

Create Request is in the lower right corner of the Help Center.

On the next page, click the green/cyan "Submit a Ticket" button in the upper right corner. Describe the problem you're having (rate limit) along with what DSBulk outputs when set for more than 15k/sec.


Aaron

To add to Aaron's response, you are hitting the default limit of 4K operations per second on your Astra DB.

We contacted you directly last week when we detected that you were hitting the limit but haven't heard back. I've reached out to you directly again today to let you know that I've logged a request on your behalf to increase the limit on your DB. Cheers!

Erick Ramirez
  • Hey Erick! Thank you for the update. I received your message last week, but at that time I was testing from a free personal account. I have now created a production account for the migration. – Vitul Goyal Sep 30 '21 at 03:29
  • @VitulGoyal I've reached out to you on the other account. Let's discuss there. Cheers! – Erick Ramirez Sep 30 '21 at 03:45