I am trying to decode barcodes that appear within a region of interest that is 80% of the screen width and 20% of the screen height, centered in both directions (the blue rectangle).

The camera pixel buffer is rotated right.

This is what Apple has to say about this orientation:

The (x,y) pixel coordinates of the origin point (0,0) represent the top row and rightmost column, respectively. Pixel (x,y) positions increase top-to-bottom, right-to-left. If an image is encoded with this orientation, then displayed by software unaware of orientation metadata, the image appears to be rotated 90° counter-clockwise. (That is, to present the image in its intended orientation, you must rotate it 90° clockwise.)

So, when I define the region of interest of my VNDetectBarcodesRequest, I do it like this:

  lazy var barcodeRequest: VNDetectBarcodesRequest = {
    let barcodeRequest = VNDetectBarcodesRequest { [weak self] request, error in
      guard error == nil else {
        print("ERROR")
        return
      }
      self?.classification(request)
    }

    barcodeRequest.regionOfInterest = CGRect(x: 0.1,
                                             y: 0.4,
                                             width: 0.9,
                                             height: 0.6)

    return barcodeRequest
  }()

If the bar code is inside the blue area and at any point above that, including anywhere on the area at the top of the blue area, it will detect. If the barcode is below the blue area, nothing is detected.

Duck

2 Answers

Just making sure, if you look at regionOfInterest, the documentation says:

The rectangle is normalized to the dimensions of the processed image. Its origin is specified relative to the image's lower-left corner.

So the origin (0,0) is at the bottom left. With your current CGRect,

CGRect(x: 0.1,
       y: 0.4,
       width: 0.9,
       height: 0.6)

you are getting the expected result - "If the bar code is inside the blue area and at any point above that, including anywhere on the area at the top of the blue area, it will detect."

All you need to do is change the height from 0.6 to 0.2. You will want:

barcodeRequest.regionOfInterest = CGRect(x: 0.1,
                                         y: 0.4,
                                         width: 0.9,
                                         height: 0.2) /// your height is wrong
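
To see why the original rectangle also matched barcodes above the blue area, it can help to flip the Vision rectangle into top-left-origin coordinates, the convention UIKit uses. This is a minimal sketch that needs only Foundation; `flippedVertically` is a hypothetical helper name, not part of Vision.

```swift
import Foundation

/// Hypothetical helper: converts a normalized rect between a lower-left
/// origin (Vision) and a top-left origin (UIKit). The flip is its own inverse.
/// Assumes a non-negative height.
func flippedVertically(_ rect: CGRect) -> CGRect {
    CGRect(x: rect.origin.x,
           y: 1 - rect.origin.y - rect.size.height,
           width: rect.size.width,
           height: rect.size.height)
}

// The rect from the question: it starts 40% up from the bottom and extends
// a further 60%, so it reaches the top edge of the upright image.
let original = CGRect(x: 0.1, y: 0.4, width: 0.9, height: 0.6)
let originalFromTop = flippedVertically(original)

// With the corrected height, only the band between 40% and 60% is covered.
let corrected = CGRect(x: 0.1, y: 0.4, width: 0.9, height: 0.2)
let correctedFromTop = flippedVertically(corrected)

print(originalFromTop.origin.y, originalFromTop.size.height)
print(correctedFromTop.origin.y, correctedFromTop.size.height)
```

Measured from the top, the original rect starts at roughly y = 0 and spans 60% of the height, so it covers everything from the top of the image down to the bottom of the blue area. That matches the behavior described in the question.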
aheze
  • I am accepting your answer and upvoting you because it worked, but I do not understand it. I do not see how it can be 0.2! The image coming from the camera is oriented right, so its 0,0 coordinate is at the top right. – Duck Feb 02 '21 at 09:49
  • I do not understand this normalized coordinate system. – Duck Feb 02 '21 at 10:19
  • @Duck but you make the request handler something like `imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)`, right? That tells Vision that the orientation of the image is right, so it will automatically rotate it for you. This "processed image," that is rotated to upright, is what `regionOfInterest` applies to. – aheze Feb 02 '21 at 16:50
  • Then `regionOfInterest` is a CGRect. But unlike most UIKit rects, the origin is in the bottom left. – aheze Feb 02 '21 at 16:52
  • ok. but it is not just a rect and I don't get it. – Duck Feb 02 '21 at 17:55
  • @Duck so normalized means it's relative to the entire length. So let's say the "processed image" is 500 width x 1000 height. Then, the regionOfInterest `CGRect(x: 0.2, y: 0.2, width: 0.5, height: 0.5)` would convert to an actual rectangle of `origin: (100, 200) size: (250, 500)`. This rectangle still has the (0,0) at the bottom left. – aheze Feb 02 '21 at 18:38
  • According to this, both your answer and the assumption in my question are wrong. I need a ROI 80% wide and 20% high, centered on the image, so it should be (0.1, 0.4, 0.8, 0.2). But because the image from the camera is rotated clockwise 90º, it will be (0.4, 0.1, 0.2, 0.8), right? Or is the ROI always calculated as if the image were upright? – Duck Feb 03 '21 at 11:45
  • I have tried the second rect and it does not work, whereas the first one does. So I assume the ROI is always defined against the upright image. Another shame on the crappy documentation Apple writes. – Duck Feb 03 '21 at 11:57
  • @Duck yes, ROI is for the upright image. That's what I meant by "processed image that is rotated to upright." I [answered](https://stackoverflow.com/a/66054211/14351818) on a related question, you might find it useful – aheze Feb 04 '21 at 21:50
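
The arithmetic from the comments above can be sketched as a small function. `pixelRect(for:imageWidth:imageHeight:)` is a hypothetical name used for illustration. Vision ships its own `VNImageRectForNormalizedRect` for this kind of conversion, which this sketch avoids so that it only needs Foundation.

```swift
import Foundation

/// Hypothetical helper: scales a normalized rect up to image coordinates.
/// The origin stays in the lower-left corner, as in Vision.
func pixelRect(for normalized: CGRect, imageWidth: CGFloat, imageHeight: CGFloat) -> CGRect {
    CGRect(x: normalized.origin.x * imageWidth,
           y: normalized.origin.y * imageHeight,
           width: normalized.size.width * imageWidth,
           height: normalized.size.height * imageHeight)
}

// Example values from the comment: a 500 x 1000 upright ("processed") image.
let exampleROI = CGRect(x: 0.2, y: 0.2, width: 0.5, height: 0.5)
let examplePixelRect = pixelRect(for: exampleROI, imageWidth: 500, imageHeight: 1000)

print(examplePixelRect)
```

Up to floating-point rounding, this yields the origin (100, 200) and size (250, 500) mentioned in the comment.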

Just to chime in here for added clarity, cuz this was also tripping me up.

The documentation for regionOfInterest says:

The default value is { { 0, 0 }, { 1, 1 } }

I was also mistaking this for two points (a bottom-left corner and a top-right corner). But that last pair is supposed to be the normalized width and height, not a normalized coordinate.

// ❌
request.regionOfInterest = CGRect(x: 0.1, y: 0.4, width: 0.9, height: 0.6)

// ✔️
request.regionOfInterest = CGRect(x: 0.1, y: 0.4, width: 0.8, height: 0.2)
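
If it is easier to think in terms of two corners, the conversion to the origin-plus-size form that `CGRect` expects can be sketched as follows. `normalizedRect(bottomLeft:topRight:)` is a hypothetical helper, not a Vision API, and it only needs Foundation.

```swift
import Foundation

/// Hypothetical helper: builds an origin-plus-size rect from two corners
/// given in normalized, lower-left-origin coordinates.
func normalizedRect(bottomLeft: CGPoint, topRight: CGPoint) -> CGRect {
    CGRect(x: bottomLeft.x,
           y: bottomLeft.y,
           width: topRight.x - bottomLeft.x,
           height: topRight.y - bottomLeft.y)
}

// Corners of a centered region that is 80% wide and 20% high.
let region = normalizedRect(bottomLeft: CGPoint(x: 0.1, y: 0.4),
                            topRight: CGPoint(x: 0.9, y: 0.6))

print(region.size.width, region.size.height)
```

The resulting size is approximately 0.8 by 0.2, which is what `regionOfInterest` expects, rather than the corner values 0.9 and 0.6.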
Peter Parker
  • I love Apple documentation. – Duck Feb 21 '22 at 21:13
  • How do I calculate the regionOfInterest correctly? In my case the camera preview frame is CGRect(0.0, 780.0, 810.0, 300.0) on an iPad 8th Gen, and AVCaptureVideoPreviewLayerInstance.videoGravity = .resizeAspectFill, so the video fills the preview by cropping part of the top and bottom while keeping the aspect ratio. I am looking for a way to scan only the barcodes that are visible inside the preview; currently it also scans barcodes that lie above or below the preview and are not shown. Please provide an answer for this. – Coder_A_D Jul 22 '22 at 12:01