CSV format for ML Vision

Question

Wrong number of CSV columns for AutoML Vision, and no documentation on the correct format

I'm trying to use AutoML to train a model on some images I've annotated. It complains that the CSV I wrote has the wrong number of columns (it says there should be 11 rather than 9). However, all the examples of the CSV I can find have 9 columns. I suspect the format has changed and the documentation wasn't updated. Thanks in advance for your help.

Details:

This is the example from their documentation:

[set,]image_path[,label,x1,y1,,,x2,y2]
TRAIN,gs://My_Bucket/sample1.jpg,cat,0.125,0.25,,,0.375,0.5
VALIDATE,gs://My_Bucket/sample1.jpg,cat,0.4,0.3,,,0.55,0.55
TEST,gs://My_Bucket/sample1.jpg,dog,0.5,0.675,,,0.75,0.875

Here is a line from my test data:

TRAIN,gs://mytensorflowdata/CanVideo 50.jpg,sodacan,0.378125,0.10138888888888889,,,0.61796875,0.8708333333333333

I also tried it without the bounding boxes, like this:

TRAIN,gs://mytensorflowdata/CanVideo 50.jpg,sodacan

Here is the error message AutoML gives me:

Error: gs://mytensorflowdata/labels.csv line 13: Expected 11 columns, but found 3 columns only.

score 1 · Answer 1 · answered Oct 13 '19 at 19:32

I just ran into the same problem. You are right that they haven't updated the documentation everywhere. This page does show 11 columns though: https://cloud.google.com/vision/automl/object-detection/docs/csv-format. It seems like they added the option to provide all 4 corners of the bounding box. The new example is:

TRAIN,gs://folder/image1.png,car,0.1,0.1,,,0.3,0.3,,
TRAIN,gs://folder/image1.png,bike,.7,.6,,,.8,.9,,
UNASSIGNED,gs://folder/im2.png,car,0.1,0.1,0.2,0.1,0.2,0.3,0.1,0.3
TEST,gs://folder/im3.png,,,,,,,,,

So you need to add two empty columns at the end of each line of your CSV, so that every row has 11 columns, like so:

TRAIN,gs://mytensorflowdata/CanVideo 50.jpg,sodacan,0.378125,0.10138888888888889,,,0.61796875,0.8708333333333333,,
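If you have a lot of rows, you can pad them with a script rather than by hand. Here is a minimal sketch using Python's standard csv module. The names pad_rows, pad_text and EXPECTED_COLUMNS are my own, not part of any AutoML tooling, and it assumes every row starts with the set column (TRAIN, VALIDATE, TEST or UNASSIGNED) so that the missing fields are always at the end. It only fixes the column count; it does not check that the coordinates or labels are valid.

```python
import csv
import io

# Column count taken from the AutoML error message ("Expected 11 columns").
EXPECTED_COLUMNS = 11


def pad_rows(src, dst, width=EXPECTED_COLUMNS):
    """Copy CSV rows from src to dst, right-padding short rows with empty fields."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if not row:
            continue  # skip blank lines
        if len(row) > width:
            raise ValueError(
                f"row has {len(row)} columns, expected at most {width}: {row}"
            )
        writer.writerow(row + [""] * (width - len(row)))


def pad_text(text, width=EXPECTED_COLUMNS):
    """Convenience wrapper that pads CSV content held in a string."""
    out = io.StringIO()
    pad_rows(io.StringIO(text), out, width)
    return out.getvalue()


if __name__ == "__main__":
    line = (
        "TRAIN,gs://mytensorflowdata/CanVideo 50.jpg,sodacan,"
        "0.378125,0.10138888888888889,,,0.61796875,0.8708333333333333\n"
    )
    print(pad_text(line), end="")
```

To convert a file, open the input and output with newline="" (as the csv module docs recommend) and pass the two file objects to pad_rows, then upload the padded file to your bucket in place of the original labels.csv.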
