
I was trying to get predictions from gcloud ml-engine with the TensorFlow object detection pets example, but it doesn't work.

I created a checkpoint using this example: https://github.com/tensorflow/models/blob/master/object_detection/g3doc/running_pets.md

With the help of the tensorflow team, I was able to create a SavedModel to upload to the gcloud ml-engine: https://github.com/tensorflow/models/issues/1811

Now, I can upload the model to the gcloud ml-engine. But unfortunately, I'm not able to make correct prediction requests against the model. Every time I try a prediction, I get the same error:

Input instances are not in JSON format.

I was trying to do online predictions with

gcloud ml-engine predict --model od_test --version v1 --json-instances prediction_test.json

and I was trying to do batch predictions with

gcloud ml-engine jobs submit prediction "prediction7" \
    --model od_test \
    --version v1 \
    --data-format TEXT \
    --input-paths gs://ml_engine_test1/prediction_test.json \
    --output-path gs://ml_engine_test1/prediction_output \
    --region europe-west1

I want to submit a list of images as uint8 matrices, so for the export I used the input type image_tensor.

As stated in the documentation here: https://cloud.google.com/ml-engine/docs/concepts/prediction-overview#prediction_input_data, the input JSON should have a particular format. But neither the format for online predictions nor the format for batch predictions works. My latest tests were a single file with the content:

{"instances": [{"values": [1, 2, 3, 4], "key": 1}]}

and the content:

{"images": [0.0, 0.3, 0.1], "key": 3}
{"images": [0.0, 0.7, 0.1], "key": 2}

Neither of them worked. Can anyone tell me what the input format should be?

edit

The error from the batch processing is

{
    "insertId": "1a26yhdg2wpxvg6",
    "jsonPayload": {
        "@type": "type.googleapis.com/google.cloud.ml.api.v1beta1.PredictionLogEntry",
        "error_detail": {
            "detail": "No JSON object could be decoded",
            "input_snippet": "Input snippet is unavailable."
        },
        "message": "No JSON object could be decoded"
    },
    "logName": "projects/tensorflow-test-1-168615/logs/worker",
    "payload": {
        "@type": "type.googleapis.com/google.cloud.ml.api.v1beta1.PredictionLogEntry",
        "error_detail": {
            "detail": "No JSON object could be decoded",
            "input_snippet": "Input snippet is unavailable."
        },
        "message": "No JSON object could be decoded"
    },
    "receiveTimestamp": "2017-07-28T12:31:23.377623911Z",
    "resource": {
        "labels": {
            "job_id": "prediction10",
            "project_id": "tensorflow-test-1-168615",
            "task_name": ""
        },
        "type": "ml_job"
    },
    "severity": "ERROR",
    "timestamp": "2017-07-28T12:31:23.377623911Z"
}
  • The error message you are reporting seems to be coming from `gcloud ml-engine local predict`, can you confirm? If so, what is the error message returned by the service? – rhaertel80 Jul 28 '17 at 15:33
  • you are right, the error seems to come from gcloud, not the model. – Jörg Kiesewetter Jul 31 '17 at 08:41
  • That error message happens when `json.loads` raises a `ValueError`. Do you mind providing a copy to a link of our input file? – rhaertel80 Jul 31 '17 at 15:16
  • You mean the file which I use? yes, no problem: https://drive.google.com/file/d/0B-LzDebt2EYGSmZ5d3l0NHplUW8/view?usp=sharing The file was created with notepad++ on windows 10. At first it had the encoding ANSI and after my first tries I changed the encoding to UTF-8 – Jörg Kiesewetter Jul 31 '17 at 16:56
  • OK, so there's still something wrong with your file (has a few extra characters up front): `>>> open("prediction_test.json").read() '\xef\xbb\xbf{"instances": [{"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}]}'` I verified that removing those characters allows the json to be parsable – rhaertel80 Jul 31 '17 at 20:31
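The stray characters rhaertel80 found (`EF BB BF`) are a UTF-8 byte-order mark, which Notepad++ prepends when saving as "UTF-8" rather than "UTF-8 without BOM". A minimal sketch (plain Python; the helper name is my own) that strips the BOM so `json.loads` succeeds:

```python
import json

def strip_bom(raw: bytes) -> bytes:
    """Drop a UTF-8 byte-order mark (EF BB BF) if the payload starts with one."""
    return raw[3:] if raw.startswith(b"\xef\xbb\xbf") else raw

# With the BOM present, json.loads raises ValueError; after stripping it parses.
payload = b'\xef\xbb\xbf{"inputs": [[[242, 240, 239]]]}'
parsed = json.loads(strip_bom(payload).decode("utf-8"))
```

To repair an existing file, read it in binary mode, pass the bytes through `strip_bom`, and write them back.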

2 Answers

I believe the particular model expects binary image data for prediction.

I'd expect your request would be something along the lines of:

{
  "instances": [
    { "images": { "b64": "image-bytes-base64-encoded" }, "key": 1 },
    ...
  ]
}

Hope that helps toward a working solution. Let us know if it doesn't and I'll try to get you something more definitive.
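As a sketch of how such a request body could be produced (this assumes the exported signature really does take encoded image bytes under an input named `images`; the helper name is hypothetical):

```python
import base64
import json

def make_b64_request(image_bytes: bytes, key: int) -> str:
    """Build a prediction request body with base64-encoded image bytes.
    The {"b64": ...} wrapper tells the service to decode the string
    back into raw bytes before feeding the model."""
    instance = {
        "images": {"b64": base64.b64encode(image_bytes).decode("utf-8")},
        "key": key,
    }
    return json.dumps({"instances": [instance]})
```

For example, `make_b64_request(open("cat.jpg", "rb").read(), 1)` would yield a body in the shape shown above.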

Nikhil Kothari
  • I am using this format, but the server keeps rejecting it. I am doing the b64 conversion in C#. Any help? Error: `{ "error": "Failed to process element: 0 key: image of 'instances' list. Error: Invalid argument: JSON Value: \"b64\" Type: String is not of expected type: float" }` – PCG Feb 15 '19 at 03:40

The model you exported accepts input in the following format for prediction if you use gcloud to submit your requests, both for gcloud ml-engine local predict and for batch prediction:

{"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}
{"inputs": [[[232, 242, 219], [242, 240, 239], [242, 240, 239], [242, 242, 239], [242, 240, 123]]]}
...

If you're sending the requests directly to the service (i.e., not using gcloud), the body of the request would look like:

{"instances": [{"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}]}
{"instances": [{"inputs": [[[232, 242, 219], [242, 240, 239], [242, 240, 239], [242, 242, 239], [242, 240, 123]]]}]}

The input tensor name should be "inputs" because that is what we've specified in the signature's inputs. The value of each JSON object is a 3-D array, as you can tell from here. The outer dimension is None to support batched input. No "instances" wrapper is needed (unless you use the HTTP API directly). Note that you cannot specify "key" in the input unless you modify the graph to include an extra placeholder and pass it through untouched using tf.identity.
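A small sketch of writing such a newline-delimited input file (the helper name is my own; real image arrays would come from e.g. `numpy.asarray(image).tolist()`):

```python
import json

def make_input_line(pixels) -> str:
    """Serialize one image (H x W x 3 nested lists of uint8 values) as a
    single JSON line in the format gcloud batch prediction expects."""
    return json.dumps({"inputs": pixels})

# One line per image; the file is read as newline-delimited JSON.
images = [
    [[[242, 240, 239], [242, 240, 239]]],
    [[[232, 242, 219], [242, 240, 239]]],
]
with open("input.json", "w") as fp:
    for image in images:
        fp.write(make_input_line(image) + "\n")
```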

Also, as mentioned in the GitHub issue, the online service may not work due to the large amount of memory the model requires. We are working on that.

rhaertel80
yxshi
  • unfortunately, this is still not working. The error is still `Input instances are not in JSON format. See "gcloud ml-engine predict --help" for details.`. The error is shown very fast, so it seems like the ml-engine is giving this error before the model is even loaded. – Jörg Kiesewetter Jul 31 '17 at 08:38
  • Do you have any hidden/invisible chars in the input files? Can you run json validator to ensure it is legitimate json? – yxshi Jul 31 '17 at 18:23
  • online json validators say that the file is correct – Jörg Kiesewetter Jul 31 '17 at 19:13
  • problem found: after checking the file with a hex editor, I discovered three characters in front of the file. Probably because of the conversion from ANSI to UTF-8, there were the three chars EF BB BF which were not shown in the editor. Thank you very very much for your help. – Jörg Kiesewetter Jul 31 '17 at 19:41
  • FWIW I would recommend exporting the model to use raw byte strings. uint8 in JSON is a fairly inefficient way to encode an image. – rhaertel80 Aug 02 '17 at 06:04
  • what do you mean: "The outer dimension is None to support batched-input" i'm getting a graph error expected shape (,1) but got (2,) or (1,) – Jason Dec 11 '17 at 20:58
  • fp = open('input.json', 'wb'); for numpy_image_object in objects: fp.write(json.dumps({'inputs': numpy_image_object.tolist()})+'\n'); fp.close() – Hassan Kamal Aug 03 '18 at 20:01