
I am trying to create a Watson Visual Recognition classifier (the Create Classifier call) using v3 of the REST API, following the documentation at https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/visual-recognition/customizing.shtml#goodclassifying, which states:

There are size limitations for training calls and data:
  • The service accepts a maximum of 10,000 images or 100 MB per .zip file.
  • The service requires a minimum of 10 images per .zip file.
  • The service accepts a maximum of 256 MB per training call.

However, I am using a "positive" zip file of 48 MB containing 594 images (maximum image size 144 KB) and a "negative" zip file of 16 MB containing 218 images (maximum image size 114 KB), yet I keep getting the error:

<html>
<head><title>413 Request Entity Too Large</title></head>
<body bgcolor="white">
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx</center>
</body>
</html>

In response to:

curl -X POST \
  -F "good_positive_examples=@positive.zip" \
  -F "negative_examples=@negative.zip" \
  -F "name=myclassifier" \
  -H "X-Watson-Learning-Opt-Out: true" \
  "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key=<mykey>&version=2016-05-20"

I've kept reducing the file size by deleting images from the zips and re-trying, but I'm already well below the stated limits.
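
To double-check those numbers, a plain shell size check on the zips (nothing Watson-specific; the file names match the curl call above) looks like this:

ls -lh positive.zip negative.zip
ls -l positive.zip negative.zip | awk '{ total += $5 } END { print total, "bytes total" }'

Running curl with -v also prints the Content-Length it actually sends, which is the figure nginx compares against its client_max_body_size limit before returning 413.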

Anyone got any idea?

Thanks

  • Have you tried increasing [`client_max_body_size`](http://nginx.org/en/docs/http/ngx_http_core_module.html#client_max_body_size)? – Richard Smith Jun 13 '16 at 11:14
  • I believe the service had some hiccups yesterday. Can you try it again today and let me know if you're still getting the same issue? – Nathan Friedly Jun 14 '16 at 18:30
  • 1
    Fixed itself today. Interesting I previously checked the service status page and it all said a-ok. – twiz911 Jun 14 '16 at 23:46

2 Answers


This error (413 Request Entity Too Large) is intermittent when submitting jobs for training classifiers. I have written a script that processes a directory structure of images as classes for training, including both a training (51%) and a test (49%) set. As the API restricts payload sizes to 100 MB per ZIP file, I use zipsplit(1) to break the class ZIP files into batches. When submitting those batches I sometimes receive this error, but I discard the response and retry; invariably the API call succeeds after 2-3 attempts.
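
A minimal sketch of that discard-and-retry loop, assuming the same curl call as in the question (the zip names, classifier name, retry count and <mykey> are placeholders):

#!/bin/sh
# Retry the Create Classifier call a few times, since the 413 is intermittent.
for attempt in 1 2 3 4 5; do
  code=$(curl -s -o response.json -w "%{http_code}" -X POST \
    -F "good_positive_examples=@positive.zip" \
    -F "negative_examples=@negative.zip" \
    -F "name=myclassifier" \
    "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key=<mykey>&version=2016-05-20")
  if [ "$code" != "413" ]; then
    echo "attempt $attempt returned HTTP $code"
    cat response.json
    break
  fi
  echo "attempt $attempt got 413 Request Entity Too Large, retrying..."
  sleep 10
done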

I would guess that your in-bound connection manager is counting bytes including re-transmissions over the socket and not reporting actual payload size.

I recommend splitting ZIPs into sizes of <95 MB in order to avoid this complication in submitting images to the training API.
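
For example, using Info-ZIP's zipsplit(1) mentioned above (-n takes a maximum size in bytes; the 95 MB figure is the cap suggested here):

# Split the class archive into pieces no larger than ~95 MB each; the pieces
# can then be submitted as separate training batches as described above.
zipsplit -n 95000000 positive.zip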

The code is in the age-at-home project under dcmartin on github.com; the training script is bin/train_vr and the testing script is bin/test_vr. Your mileage may vary.

– DC Martin

I just tried with two zip files (~45 MB each) and it worked. I think it was a temporary problem with the nginx server; requests to Visual Recognition go through nginx before reaching the actual service.

– German Attanasio