I am training an Azure Custom Text Classification model. The training set of 500k text documents has been uploaded to Blob storage, so the only thing left is to use the REST API to create a training project.

The issue I am facing is that the project-creation API limits the payload to 10 MB. My training set would require a payload of about 80 MB.
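For reference, this is roughly how I estimate the payload size. It is only a sketch: the shape of the per-document entry and its field names are my assumptions for illustration, not copied from the API reference, and `document_entry`, `serialized_size` and `max_documents_under_limit` are hypothetical helpers.

```python
import json

LIMIT_BYTES = 10 * 1024 * 1024  # the 10 MB request limit I am hitting
DOCUMENT_COUNT = 500_000        # size of my training set


def document_entry(index: int) -> dict:
    """One labeled-document entry. Field names are assumed for illustration."""
    return {
        "location": f"documents/doc_{index:06d}.txt",
        "language": "en-us",
        "dataset": "Train",
        "class": {"category": "SomeCategory"},
    }


def serialized_size(obj) -> int:
    """Size in bytes of the compact JSON encoding of obj."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def max_documents_under_limit(entry_size: int, limit: int = LIMIT_BYTES) -> int:
    """How many entries fit in the limit, counting one separator byte per entry."""
    return limit // (entry_size + 1)


entry_size = serialized_size(document_entry(0))
estimated_total = DOCUMENT_COUNT * (entry_size + 1)

print(f"bytes per entry: {entry_size}")
print(f"estimated payload: {estimated_total / 1024 / 1024:.1f} MB")
print(f"entries that fit in one request: {max_documents_under_limit(entry_size)}")
```

My real entries are longer than this toy one (longer paths and category names), which is how I arrive at about 80 MB. Either way, the full set is several times over the limit.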

This would be fine if I could create a project and then append labeled documents to it in multiple batches. From what I can see, though, the only way to add this data in the Custom Text Classification API is to send it all at once during project creation, or to update the project afterwards, which overwrites the data uploaded initially. This means the training dataset for this service is effectively hard-limited to whatever fits in a 10 MB payload.

Does this make sense? I'd imagine there has to be a way to add more labeled documents to a project than fit into a 10 MB payload.

PS: I tried uploading the JSON file to Blob storage and creating the project that way, but this approach appears to use the same API and is subject to the same 10 MB payload limit. I also tried creating a project and then replacing the project JSON in Blob storage, but that fails with a complaint that the file was manually changed.
