Copy activity from Blob Storage account to Cosmos DB is very slow

Question

Situation:

I'm using the Copy activity in Azure Data Factory to copy a single 500 MB JSON file from a storage account blob to Cosmos DB, and from Cosmos DB back to a storage account blob.

The AzureBlobStorageLinkedService is configured with a SAS token.

Times:

Cosmos DB to storage account blob: 4 minutes

Storage account blob to Cosmos DB: anywhere from 2 hours to over 7 hours (timeout)

CosmosDB:

Before the copy activity starts, an empty collection with 20,000 RU/s is created. I looked at the Cosmos DB metrics and the database is mostly idle; there are only a few 429 errors. We use the default indexing configuration and a partition key, so the data is spread over several partition keys in several partition key ranges (partitions).

Data:

The JSON file contains 48,000 JSON objects. Some are small and some are as large as 200 KB.

Tries:

I tried different WriteBatchSize values:

5: 2 hours

100: 2 hours

10,000: 7 hours (timeout)

I tried the same region and different regions: no difference.

I tried smaller files: they are much faster (500 KB/s instead of 50 KB/s).
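To put the timings above in perspective, here is a rough back-of-the-envelope sketch of the average rates they imply. It assumes 500 MB means 500,000,000 bytes and takes the 2-hour run as the reference; the function name `copy_rates` is made up for illustration.

```python
def copy_rates(total_bytes: int, doc_count: int, seconds: float) -> tuple[float, float]:
    """Return (kilobytes per second, documents per second) for a finished copy."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_bytes / seconds / 1000, doc_count / seconds


# Figures from the question: 500 MB, 48,000 objects, 2 hours.
kb_s, docs_s = copy_rates(500_000_000, 48_000, 2 * 3600)
print(f"{kb_s:.1f} KB/s, {docs_s:.1f} docs/s")  # 69.4 KB/s, 6.7 docs/s
```

Under these assumptions the write path is handling fewer than 7 documents per second, which is in the same range as the 50 KB/s figure quoted above.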

Question:

Why is it so slow? Is a 500 MB file too large?

There's really not enough detail to diagnose this. For instance: how did you partition your data? The RU that you allocate is divided up across the underlying physical partitions; if all of your data in a single blob is in a single logical partition (which then maps to a single physical partition), you are only using a fraction of your 20K RU (my guess is around 4K available, since you likely have a default of 5 physical partitions). Also: are you indexing all properties? If so, you'll burn more RU on writes than if you have a custom index policy. Please edit to clarify. But... a very broad q. — David Makogon, Dec 14 '18 at 16:34
@MaviDomates - There's no correlation between performance and resource groups, since resource groups are merely a logical construct for grouping, permissions, etc. And the location of a resource group has nothing to do with the region of the services themselves. — David Makogon, Dec 14 '18 at 16:35
You might want to look at your Cosmos DB metrics to see if there was any throttling taking place during the data move. — David Makogon, Dec 14 '18 at 16:38
I tried the same region and different regions: no difference. Cosmos DB: yes, I looked at the metrics and the database is mostly idle. There are only a few 429 errors. We use the default indexing configuration and a partition key, so the data is spread over several partition keys in several partition key ranges (partitions). — Mike V., Dec 17 '18 at 08:38
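The arithmetic behind the partitioning comment above can be sketched as follows. This is only an illustration: the count of 5 physical partitions is the commenter's guess, not a measured value, and `usable_ru` is a hypothetical helper, not part of any SDK.

```python
def usable_ru(total_ru: int, physical_partitions: int, partitions_hit: int) -> float:
    """RU/s reachable when writes land on only some of the physical partitions.

    Provisioned throughput is split evenly across physical partitions, so a
    write stream that touches one partition can use only that partition's share.
    """
    if physical_partitions <= 0:
        raise ValueError("physical_partitions must be positive")
    if not 0 <= partitions_hit <= physical_partitions:
        raise ValueError("partitions_hit must be between 0 and physical_partitions")
    per_partition = total_ru / physical_partitions
    return per_partition * partitions_hit


# 20,000 RU/s, assumed 5 physical partitions, all writes hitting one of them.
print(usable_ru(20_000, 5, 1))  # 4000.0
```

If the writes really are spread over all partitions, as the asker reports, the full 20,000 RU/s is available and this effect would not explain the slowdown on its own.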

score 0 · Accepted Answer · answered Dec 17 '18 at 13:13

I tried very high throughput values and it worked fine:

1,000,000 RU/s: 9 minutes ✔
100,000 RU/s: 15 minutes ✔

But I have to remember to scale down after the data transfer is complete, because of the cost!
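One way to make sure the scale-down is not forgotten is to wrap the copy in a context manager that restores the original throughput afterwards. This is a minimal sketch, not a tested production script. It assumes a container object that exposes `get_throughput()` (returning an object with an `offer_throughput` attribute) and `replace_throughput()`, which is how I understand the `azure-cosmos` Python SDK's container client to behave; please verify against the SDK version you use. `FakeContainer` and `FakeThroughput` are stand-ins invented here so the example runs without an Azure account.

```python
from contextlib import contextmanager


@contextmanager
def temporary_throughput(container, high_ru: int):
    """Raise provisioned RU/s for the duration of a block, then restore it."""
    original_ru = container.get_throughput().offer_throughput
    container.replace_throughput(high_ru)
    try:
        yield container
    finally:
        # Runs even if the copy fails, so the expensive setting is not left on.
        container.replace_throughput(original_ru)


class FakeThroughput:
    """Stand-in for the SDK's throughput result (assumption, see above)."""

    def __init__(self, ru: int):
        self.offer_throughput = ru


class FakeContainer:
    """Stand-in container that records every throughput change."""

    def __init__(self, ru: int):
        self._ru = ru
        self.history = [ru]

    def get_throughput(self) -> FakeThroughput:
        return FakeThroughput(self._ru)

    def replace_throughput(self, ru: int) -> FakeThroughput:
        self._ru = ru
        self.history.append(ru)
        return FakeThroughput(ru)


container = FakeContainer(20_000)
with temporary_throughput(container, 100_000):
    pass  # trigger the Data Factory pipeline here and wait for it to finish
print(container.history)  # [20000, 100000, 20000]
```

With a real container client in place of `FakeContainer`, the `finally` branch means the collection drops back to its original RU/s even when the pipeline run raises an error.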

Mike V.