How does one return the pixel coordinates (or position) of a label in Google Vision API's label detection in Java?

Question

I can run label detection on an image using the Vision API. However, I want to know the coordinates of where that label was detected. For example, if a circle is detected somewhere in the image, how do I return the center of that circle in the Vision API?

The following is what was returned when I ran label detection on a local image of an ellipse. Unlike text detection, the response does not include x,y coordinates for what was detected:

google.cloud.vision.v1.EntityAnnotation.mid : /m/01vkl
google.cloud.vision.v1.EntityAnnotation.description : Circle
google.cloud.vision.v1.EntityAnnotation.score : 0.8125786
google.cloud.vision.v1.EntityAnnotation.topicality : 0.8125786
google.cloud.vision.v1.EntityAnnotation.mid : /m/03scnj
google.cloud.vision.v1.EntityAnnotation.description : Line
google.cloud.vision.v1.EntityAnnotation.score : 0.7547606
google.cloud.vision.v1.EntityAnnotation.topicality : 0.7547606
google.cloud.vision.v1.EntityAnnotation.mid : /m/03g09t
google.cloud.vision.v1.EntityAnnotation.description : Clip art
google.cloud.vision.v1.EntityAnnotation.score : 0.68722004
google.cloud.vision.v1.EntityAnnotation.topicality : 0.68722004
google.cloud.vision.v1.EntityAnnotation.mid : /m/06g58b
google.cloud.vision.v1.EntityAnnotation.description : Oval
google.cloud.vision.v1.EntityAnnotation.score : 0.60591185
google.cloud.vision.v1.EntityAnnotation.topicality : 0.60591185

score 0 · Answer 1 · answered Jan 16 '20 at 08:29

Label detection cannot return coordinates for a label because, as documented here, labels can identify general objects, locations, activities, etc.
This means that labels are not tied to a specific position in the image; they are derived from the context of the whole image.

However, some labels do correspond to specific objects. For example, using this image, you will see both the label 'cat' and the object 'cat' (you can test this using the API Explorer). Object detection, unlike label detection, returns a bounding polygon for each object it finds. So you can run Object and Label detection on the same image and merge the results, looking for labels that correspond to objects.
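
A minimal sketch of that merge step in plain Java, assuming the two requests have already been made and their results copied into simple holder objects. The class and method names (`LabelLocator`, `LocatedObject`, `LabelPosition`, `locate`) are hypothetical and not part of the Vision client library. In the Java client, the inputs would presumably come from `EntityAnnotation.getDescription()` for labels and from `LocalizedObjectAnnotation.getName()` plus `getBoundingPoly().getNormalizedVerticesList()` for objects. The sketch also assumes that a case-insensitive name match is good enough and that the mean of the bounding-box vertices is an acceptable "center".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: matches label names to localized objects and
// converts each matching object's normalized bounding box to a pixel center.
class LabelLocator {

    /** Stand-in for a localized object: name plus vertices normalized to [0, 1]. */
    static final class LocatedObject {
        final String name;
        final double[] xs; // normalized x of each bounding-box vertex
        final double[] ys; // normalized y of each bounding-box vertex

        LocatedObject(String name, double[] xs, double[] ys) {
            this.name = name;
            this.xs = xs;
            this.ys = ys;
        }
    }

    /** A label that matched an object, with the box center in pixels. */
    static final class LabelPosition {
        final String label;
        final double centerX;
        final double centerY;

        LabelPosition(String label, double centerX, double centerY) {
            this.label = label;
            this.centerX = centerX;
            this.centerY = centerY;
        }
    }

    /** Returns one entry per (label, object) pair whose names match, ignoring case. */
    static List<LabelPosition> locate(List<String> labels, List<LocatedObject> objects,
                                      int imageWidth, int imageHeight) {
        List<LabelPosition> result = new ArrayList<>();
        for (String label : labels) {
            for (LocatedObject obj : objects) {
                if (label.equalsIgnoreCase(obj.name)) {
                    // Normalized coordinates are scaled by the image size to get pixels.
                    result.add(new LabelPosition(label,
                            mean(obj.xs) * imageWidth,
                            mean(obj.ys) * imageHeight));
                }
            }
        }
        return result;
    }

    /** Arithmetic mean of the given values. */
    private static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for real API responses.
        List<String> labels = Arrays.asList("Cat", "Furniture");
        List<LocatedObject> objects = Arrays.asList(
                new LocatedObject("cat",
                        new double[] {0.25, 0.75, 0.75, 0.25},
                        new double[] {0.5, 0.5, 1.0, 1.0}));

        for (LabelPosition p : locate(labels, objects, 800, 600)) {
            System.out.println(p.label + " -> (" + p.centerX + ", " + p.centerY + ")");
        }
    }
}
```

Labels with no matching object (such as "Furniture" in the sample data) are simply dropped, which reflects the point above: a label only has a position when an object of the same name was also localized.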
