
I am building an app for object detection with a camera API (CameraX) and a self-trained TFLite model.

For the integration of TFLite I have tried two approaches: first with ML Kit and second with TensorFlow Lite inference directly. I like the second approach better, but there I have the problem of getting the coordinates of the detection boxes for the detected objects.

As a basis I used the code from: https://github.com/soum-io/TensorFlowLiteInceptionTutorial/blob/master/app/src/main/java/com/soumio/inceptiontutorial/Classify.java.

import org.tensorflow.lite.Interpreter;
// ...
private final Interpreter.Options tfliteOptions = new Interpreter.Options();
private Interpreter tflite;

private byte[][] labelProbArray = null;
private int[] intValues;
private int DIM_IMG_SIZE_X = 224;
private int DIM_IMG_SIZE_Y = 224;
private int DIM_PIXEL_SIZE = 3;
// ...
choosenModel = "detect_224_quant.tflite";
choosenLabel = "labelmap.txt";

intValues = new int[DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y];
tflite = new Interpreter(loadModelFile(), tfliteOptions);
labelList = loadLabelList();

imgData = ByteBuffer.allocateDirect(DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * DIM_PIXEL_SIZE);
imgData.order(ByteOrder.nativeOrder());

labelProbArray = new byte[1][labelList.size()];

Bitmap bitmap_orig = toBitmap(image);
Bitmap bitmap = getResizedBitmap(bitmap_orig, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y);
convertBitmapToByteBuffer(bitmap); //create imgData

tflite.run(imgData, labelProbArray);

printLabels();
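The convertBitmapToByteBuffer helper isn't shown; for a quantized uint8 model it presumably just packs each pixel's R, G, and B bytes into imgData. A minimal sketch of that packing, with the Android Bitmap call replaced by a plain int[] of ARGB pixels (the format Bitmap.getPixels returns) so the example stays self-contained:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PixelPacking {
    // Packs ARGB pixels (as returned by Bitmap.getPixels) into the
    // width * height * 3 RGB byte layout a quantized uint8 model expects.
    static ByteBuffer packPixels(int[] pixels, int width, int height) {
        ByteBuffer buf = ByteBuffer.allocateDirect(width * height * 3);
        buf.order(ByteOrder.nativeOrder());
        for (int p : pixels) {
            buf.put((byte) ((p >> 16) & 0xFF)); // red
            buf.put((byte) ((p >> 8) & 0xFF));  // green
            buf.put((byte) (p & 0xFF));         // blue
        }
        return buf;
    }

    public static void main(String[] args) {
        // One ARGB pixel: alpha=0xFF, R=0x10, G=0x20, B=0x30
        int[] pixels = {0xFF102030};
        ByteBuffer buf = packPixels(pixels, 1, 1);
        buf.rewind();
        System.out.printf("%02x %02x %02x%n", buf.get(), buf.get(), buf.get());
    }
}
```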

The printLabels method that produces the output:

private void printLabels() {
    for (int i = 0; i < labelList.size(); i++) {
        sortedLabels.add(
                new AbstractMap.SimpleEntry<>(labelList.get(i), (labelProbArray[0][i] & 0xff) / 255.0f));
        if (sortedLabels.size() > RESULTS_TO_SHOW) {
            sortedLabels.poll();
        }
    }
    final int size = sortedLabels.size();
    for (int i = 0; i < size; i++) {
        Map.Entry<String, Float> label = sortedLabels.poll();
        topLabels[i] = label.getKey();
        topConfidence[i] = String.format("%.0f%%", label.getValue() * 100);
    }
}

So I guess I'll just have to read the positions/coordinates of the detected objects out of the output tensor (labelProbArray) somehow. But how? I hope somebody can help!
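For reference, a detection model like the one in the official TFLiteObjectDetectionAPIModel demo does not return a single probability array: it has four output tensors (box locations, class indices, scores, detection count) that are received via runForMultipleInputsOutputs with an output map. A sketch of that layout, assuming the standard export with 10 detections and boxes as normalized [ymin, xmin, ymax, xmax]; the interpreter call itself is commented out since it needs the real model file:

```java
import java.util.HashMap;
import java.util.Map;

public class DetectionOutputSketch {
    // The standard TFLite detection export emits up to 10 detections;
    // verify this against your own model's output tensor shapes.
    static final int NUM_DETECTIONS = 10;

    public static void main(String[] args) {
        // Four output buffers matching the four output tensors.
        float[][][] outputLocations = new float[1][NUM_DETECTIONS][4]; // [ymin,xmin,ymax,xmax], 0..1
        float[][] outputClasses = new float[1][NUM_DETECTIONS];
        float[][] outputScores = new float[1][NUM_DETECTIONS];
        float[] numDetections = new float[1];

        Map<Integer, Object> outputMap = new HashMap<>();
        outputMap.put(0, outputLocations);
        outputMap.put(1, outputClasses);
        outputMap.put(2, outputScores);
        outputMap.put(3, numDetections);

        // With the real interpreter and input buffer you would run:
        // Object[] inputs = {imgData};
        // tflite.runForMultipleInputsOutputs(inputs, outputMap);

        // Dummy detection to show the coordinate conversion:
        outputLocations[0][0] = new float[]{0.1f, 0.2f, 0.5f, 0.8f};
        float[] px = toPixelBox(outputLocations[0][0], 224, 224);
        System.out.printf("left=%.1f top=%.1f right=%.1f bottom=%.1f%n",
                px[0], px[1], px[2], px[3]);
    }

    // Converts a normalized [ymin, xmin, ymax, xmax] box to pixel
    // [left, top, right, bottom] for an image of the given size.
    static float[] toPixelBox(float[] box, int imgWidth, int imgHeight) {
        return new float[]{
                box[1] * imgWidth,   // left
                box[0] * imgHeight,  // top
                box[3] * imgWidth,   // right
                box[2] * imgHeight   // bottom
        };
    }
}
```

Note that the coordinates are relative to the resized 224x224 input, so they still have to be scaled back to the original bitmap's dimensions for drawing.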

Felix
  • Does this code compile and give results? If you print the labelProbArray what does it give you? – Farmaker Jul 13 '20 at 17:23
  • I have added the output from printlabel to the code. Yes there is an output from where I get the labels and the probabilities. – Felix Jul 13 '20 at 17:31
  • Felix, have you checked the official demo from TensorFlow for object detection? Look at [this](https://github.com/tensorflow/examples/blob/master/lite/examples/object_detection/android/app/src/main/java/org/tensorflow/lite/examples/detection/tflite/TFLiteObjectDetectionAPIModel.java) to see how the interpreter outputs all sorts of information. From your code I do not understand how you can get all that! – Farmaker Jul 13 '20 at 17:39
  • 1
    So far I see that they use tflite.runForMultipleInputsOutputs() instead of just tflite.run(). And they have an outputMap with the location. Maybe this is the solution.. I'll check this out and update my post. Thank you. – Felix Jul 13 '20 at 17:49
  • @Farmaker This was part of the solution. The other part was that my code was designed for a different tensor shape. In the example model I used there was no coordinate info at all... – Felix Jul 14 '20 at 09:41
  • Nice! Happy to help – Farmaker Jul 14 '20 at 12:35
  • @Felix did that solution work? If yes, could you add an accepted answer? – Meghna Natraj Aug 08 '20 at 06:35

0 Answers