I'm using pydocumentdb to upload some processed data to Azure Cosmos DB as documents with a Python script. The files all come from the same source. Ingestion works fine for some files, but files larger than about 1000 KB fail with the following error:

    pydocumentdb.errors.HTTPFailure: Status code: 413
    {"code":"RequestEntityTooLarge","message":"Message: {\"Errors\":[\"Request size is too large\"]}"}

I'm using the SQL API, and this is how I create the document inside a collection:

    from pydocumentdb import document_client

    client = document_client.DocumentClient(uri, {'masterKey': cosmos_key})
    # ... I get the Db link and Collection link ...
    client.CreateDocument(collection_link, data)

How can I solve this error?

Marc
  • How do you define "small files" and "larger files"? Are you saying everything works, as written, for a document under a specific size? Please edit your question to provide more details. – David Makogon May 30 '18 at 15:40
  • Hi @DavidMakogon, thank you for the comments, I've edited the post with some more accurate details. The files that are around 500 KB are uploaded correctly, but the ones around 1000 KB give me a "Request size is too large" error. – Marc May 31 '18 at 15:21
  • Hi, the question is whether this error is really related to the pydocumentdb lib, or whether you just hit the Cosmos DB size limit of 2 MB per document? – Hauke Mallow Jun 01 '18 at 08:08
  • Hi @HaukeMallow, I'm new to the Azure platform, so I don't know all the details. As far as I know, I can't limit or expand the document size. The question is whether pydocumentdb or Cosmos DB has a preconfigured size limit? (I assumed the answer is no, because it's supposed to work with sizes higher than 1 MB.) Thank you! – Marc Jun 01 '18 at 12:45
  • Hi Marc, Cosmos DB has a fixed size limit of 2 MB per document (https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-resources). My guess is that you are hitting this limit. – Hauke Mallow Jun 01 '18 at 12:53
  • I see, you might be right. In this case the only solution is to divide the file into smaller documents, do you agree? @HaukeMallow (see the sketch after this thread). Thanks for the help! – Marc Jun 02 '18 at 16:51
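
If splitting is the way to go, a minimal sketch of that approach might look like the following. It assumes the payload is a JSON array that can be chunked; the chunk size, the make_chunks helper, the source file name, and the parent_id field are hypothetical, purely for illustration:

    import json

    from pydocumentdb import document_client

    CHUNK_SIZE = 500  # hypothetical records per document; tune so each part stays well under 2 MB

    def make_chunks(records, size):
        """Yield successive slices of the record list."""
        for start in range(0, len(records), size):
            yield records[start:start + size]

    client = document_client.DocumentClient(uri, {'masterKey': cosmos_key})

    with open('large_file.json') as f:  # hypothetical source file
        records = json.load(f)          # assumed to be a JSON array

    for index, chunk in enumerate(make_chunks(records, CHUNK_SIZE)):
        client.CreateDocument(collection_link, {
            'id': 'large_file-part-{}'.format(index),  # hypothetical id scheme
            'parent_id': 'large_file',                 # lets you query all parts back together
            'records': chunk,
        })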

1 Answer


In my experience, the best practice for storing large data or files with Azure Cosmos DB is to upload the data to Azure Blob Storage (or another external store) and create an attachment with its reference and associated metadata in a document on Azure Cosmos DB.

You can refer to the REST API for Attachments to understand the concept, and achieve it with the pydocumentdb methods CreateAttachment, ReplaceAttachment, QueryAttachments, and so on.
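
A minimal sketch of that pattern, assuming the legacy azure-storage SDK (BlockBlobService, pre-12.x) for the blob upload; the account, container, and file names are hypothetical:

    from azure.storage.blob import BlockBlobService  # legacy azure-storage SDK (pre-12.x)
    from pydocumentdb import document_client

    # Hypothetical storage account details, for illustration only
    blob_service = BlockBlobService(account_name='mystorageaccount', account_key=storage_key)
    blob_service.create_blob_from_path('mycontainer', 'large_file.json', 'large_file.json')
    blob_url = blob_service.make_blob_url('mycontainer', 'large_file.json')

    client = document_client.DocumentClient(uri, {'masterKey': cosmos_key})

    # Create a small document holding only metadata, well under the 2 MB limit
    doc = client.CreateDocument(collection_link, {'id': 'large_file', 'status': 'stored-in-blob'})

    # Attach a reference to the externally stored payload
    client.CreateAttachment(doc['_self'], {
        'id': 'large_file-attachment',
        'contentType': 'application/json',
        'media': blob_url,  # link to the blob rather than embedding the data
    })

This keeps each Cosmos DB document small while the full payload lives in Blob Storage, so the 2 MB per-document limit no longer applies to the data itself.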

Hope it helps.

Jay Gong