
I'm trying to detect items held in a hand using ML Kit image labeling through a camera. If, for example, I show it a soda can, it picks up objects I'm not interested in, such as the hand, face, and background, and then doesn't find the object in the hand even at a 0.25 minimum confidence threshold using Cloud Vision.

Is there a way to limit what the vision looks for or another way to increase accuracy?

PS: I am also willing to switch APIs if there is something better for this task.

//This is mostly from a google tutorial 
private fun runCloudImageLabeling(bitmap: Bitmap) {
    //Create a FirebaseVisionImage
    val image = FirebaseVisionImage.fromBitmap(bitmap)

    val detector = FirebaseVision.getInstance().visionCloudLabelDetector

    //Use the detector to detect the labels inside the image
    detector.detectInImage(image)
            .addOnSuccessListener {
                // Task completed successfully
                progressBar.visibility = View.GONE
                itemAdapter.setList(it)
                sheetBehavior.setState(BottomSheetBehavior.STATE_EXPANDED)
            }
            .addOnFailureListener {
                // Task failed with an exception
                progressBar.visibility = View.GONE
                Toast.makeText(baseContext, "Sorry, something went wrong!", Toast.LENGTH_SHORT).show()
            }
}

What I want is the ability to detect what's in the hand with high accuracy.


2 Answers


There is no setting that controls the accuracy in the built-in object detection model that Firebase ML Kit uses.

If you want more accurate detection, you have two options:

  1. Call out to Cloud Vision, the server-side API that can detect many more object categories, typically with much higher accuracy. This is a paid API, but it comes with a free quota. See the comparison page in the documentation for details.

  2. Train your own model that is better equipped for the image types you care about. You can then use this custom model in your app to get better accuracy.
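For option 2, here is a minimal sketch of wiring a custom TFLite model into image labeling. Note this uses the newer standalone ML Kit custom-labeling API (`com.google.mlkit:image-labeling-custom`) rather than the Firebase `FirebaseVision` API from the question, and the model file name is a placeholder for whatever model you train:

```kotlin
import android.graphics.Bitmap
import android.util.Log
import com.google.mlkit.common.model.LocalModel
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.label.ImageLabeling
import com.google.mlkit.vision.label.custom.CustomImageLabelerOptions

private fun runCustomImageLabeling(bitmap: Bitmap) {
    // Load a TFLite model bundled in the app's assets folder.
    // "soda_model.tflite" is a hypothetical file name.
    val localModel = LocalModel.Builder()
        .setAssetFilePath("soda_model.tflite")
        .build()

    // Only return labels the custom model is at least 50% confident about
    val options = CustomImageLabelerOptions.Builder(localModel)
        .setConfidenceThreshold(0.5f)
        .setMaxResultCount(5)
        .build()

    val labeler = ImageLabeling.getClient(options)
    labeler.process(InputImage.fromBitmap(bitmap, 0))
        .addOnSuccessListener { labels ->
            labels.forEach { label ->
                Log.d("Labeler", "${label.text}: ${label.confidence}")
            }
        }
        .addOnFailureListener { e ->
            Log.e("Labeler", "Labeling failed", e)
        }
}
```

Because the model is trained only on the categories you care about (e.g. drink cans), it won't waste its confidence budget on hands, faces, or backgrounds the way a general-purpose model does.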

Frank van Puffelen

ML Kit provides the Object Detection & Tracking API that you can use to locate objects.

That API lets you filter on the most prominent object (the one closest to the center of the viewfinder), which would be the soda can in your example. The API returns the bounding box around the object, which you can use to crop the frame and then feed the crop through the Image Labeling API. This filters out the irrelevant background and other objects.
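A rough sketch of that detect-then-crop-then-label flow, using the same Firebase ML Kit API generation as the question's code (`FirebaseVisionObjectDetectorOptions` and friends; check the exact names against the Firebase docs). It reuses the question's existing `runCloudImageLabeling()` for the final labeling step:

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ml.vision.FirebaseVision
import com.google.firebase.ml.vision.common.FirebaseVisionImage
import com.google.firebase.ml.vision.objects.FirebaseVisionObjectDetectorOptions

private fun detectCropAndLabel(bitmap: Bitmap) {
    // SINGLE_IMAGE_MODE detects the prominent objects in a still frame
    val options = FirebaseVisionObjectDetectorOptions.Builder()
        .setDetectorMode(FirebaseVisionObjectDetectorOptions.SINGLE_IMAGE_MODE)
        .build()
    val objectDetector = FirebaseVision.getInstance().getOnDeviceObjectDetector(options)

    objectDetector.processImage(FirebaseVisionImage.fromBitmap(bitmap))
        .addOnSuccessListener { objects ->
            // Pick the object whose bounding box is closest to the frame
            // center, i.e. the item being held up to the camera
            val centerX = bitmap.width / 2
            val centerY = bitmap.height / 2
            val prominent = objects.minByOrNull { obj ->
                val dx = obj.boundingBox.centerX() - centerX
                val dy = obj.boundingBox.centerY() - centerY
                dx * dx + dy * dy
            } ?: return@addOnSuccessListener

            // Clamp the box to the bitmap bounds before cropping
            val box = prominent.boundingBox
            val left = box.left.coerceIn(0, bitmap.width - 1)
            val top = box.top.coerceIn(0, bitmap.height - 1)
            val width = box.width().coerceAtMost(bitmap.width - left)
            val height = box.height().coerceAtMost(bitmap.height - top)
            val crop = Bitmap.createBitmap(bitmap, left, top, width, height)

            // Feed only the cropped object to the existing labeling code
            runCloudImageLabeling(crop)
        }
}
```

Because the labeler now only sees the cropped region, labels for the hand, face, and background drop out, and the can fills most of the image, which typically raises its label confidence.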

Chrisito
  • 494
  • 2
  • 4