I was able to detect squares in an image using VNDetectRectanglesRequest. Now I want to store each of those rectangles as a separate image (UIImage or CGImage). Below is what I tried.

        let rectanglesDetection = VNDetectRectanglesRequest { request, error in
            rectangles = request.results as! [VNRectangleObservation]
            rectangles.sort{$0.boundingBox.origin.y > $1.boundingBox.origin.y}
            
            for rectangle in rectangles {

                let rect = rectangle.boundingBox
                let imageRef = cgImage.cropping(to: rect)
                let image = UIImage(cgImage: imageRef!, scale: image!.scale, orientation: image!.imageOrientation)

                checkBoxImages.append(image)
            }

Can anybody point out what's wrong, or what the best approach would be?


Update 1

At this stage, I'm testing with an image that I added to the asset catalog.

image that I'm testing

With this image I get 7 rectangle observations: one for each cell and one for the table margin.

My task is to recognize the text inside each rectangle, and my approach is to send a VNRecognizeTextRequest for each rectangle that has been detected. My real scenario is a little more complicated than this, but I want to achieve at least this before going further.


Update 2

            for rectangle in rectangles {

                let trueX = rectangle.boundingBox.minX * image!.size.width
                let trueY = rectangle.boundingBox.minY * image!.size.height
                let width = rectangle.boundingBox.width * image!.size.width
                let height = rectangle.boundingBox.height * image!.size.height
                print("x = " , trueX , " y = " , trueY , " width = " , width , " height = " , height)
                
                let cropZone = CGRect(x: trueX, y: trueY, width: width, height: height)
                
                guard let cutImageRef: CGImage = image?.cgImage?.cropping(to:cropZone)
                else {
                    return
                }

                let croppedImage: UIImage = UIImage(cgImage: cutImageRef)
                croppedImages.append(croppedImage)
            }

My image's width and height are

width = 406.0 height = 368.0

I've captured my debug interface to give you a proper understanding.

debug interface

As @Lasse asked, this is my actual issue, shown in the screenshots.

AnujAroshA
  • Would be helpful if you described what the problem with your code is. Also, where does the `cgImage` come from that you're cropping? – Lasse Jan 12 '23 at 19:05

1 Answer

This is just a guess, since you didn't state what the actual problem is, but you're probably getting a zero-sized image for each VNRectangleObservation.

The reason is that Vision uses a normalized coordinate space from 0.0 to 1.0 with a lower-left origin.

So to get the correct rectangle in your original image, you need to convert the rect from normalized space to image space. Luckily, there is `VNImageRectForNormalizedRect(_:_:_:)` to do just that.
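As a rough sketch of that conversion (not necessarily the only way to do it): the helper names `cropRect(for:imageWidth:imageHeight:)` and `crop(_:to:)` below are made up for illustration, and the code assumes the image orientation is `.up`. It also flips the y-axis after scaling, because the rect returned by `VNImageRectForNormalizedRect` still has a lower-left origin, while `CGImage.cropping(to:)` measures from the top-left corner. Note that it uses the `CGImage` pixel dimensions rather than `UIImage.size`, which is in points and can differ when the image scale is not 1.

```swift
import CoreGraphics
import Vision

/// Hypothetical helper: converts a Vision bounding box (normalized,
/// lower-left origin) into a pixel rect for CGImage.cropping(to:),
/// which expects a top-left origin.
func cropRect(for normalizedBox: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    // Scale from the 0...1 range to pixels (origin is still lower left).
    let scaled = VNImageRectForNormalizedRect(normalizedBox, imageWidth, imageHeight)
    // Flip vertically so y is measured from the top edge.
    return CGRect(x: scaled.minX,
                  y: CGFloat(imageHeight) - scaled.maxY,
                  width: scaled.width,
                  height: scaled.height)
}

/// Hypothetical helper: crops one CGImage per observation.
/// Assumes the source image orientation is .up.
func crop(_ cgImage: CGImage, to observations: [VNRectangleObservation]) -> [CGImage] {
    return observations.compactMap { observation -> CGImage? in
        let rect = cropRect(for: observation.boundingBox,
                            imageWidth: cgImage.width,
                            imageHeight: cgImage.height)
        // cropping(to:) returns nil if the rect misses the image entirely.
        return cgImage.cropping(to: rect)
    }
}
```

Each resulting `CGImage` can then be wrapped with `UIImage(cgImage:)` and appended to your array inside the request's completion handler.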

Lasse