I've read Apple's official sample RealtimeNumberReader. It uses AVCaptureSession and converts the bounding box to screen coordinates with layerRectConverted(fromMetadataOutputRect:), a method that is only available on AVCaptureVideoPreviewLayer:
let rect = layer.layerRectConverted(fromMetadataOutputRect: box.applying(self.visionToAVFTransform))
Now I want to recognize text in an ARFrame's capturedImage and then display the bounding box on screen. Is that possible?
I already know how to recognize text in a single image from the official tutorial. My problem is how to convert the normalized bounding box coordinates to viewport coordinates.
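One direction that looks plausible is ARFrame's displayTransform(for:viewportSize:), which maps normalized image coordinates to normalized view coordinates. Below is a minimal, untested sketch of that idea. The function name viewRect(forVisionBox:in:viewportSize:orientation:) is hypothetical. It assumes the Vision request was run on the raw capturedImage with the default (.up) orientation, so the box is expressed relative to the unrotated camera image.

```swift
import ARKit
import UIKit

/// Hypothetical helper: converts a Vision bounding box (normalized, origin at
/// the bottom-left of the image) into a rect in view points (origin top-left).
/// Assumes Vision ran on frame.capturedImage without an orientation override.
func viewRect(forVisionBox box: CGRect,
              in frame: ARFrame,
              viewportSize: CGSize,
              orientation: UIInterfaceOrientation) -> CGRect {
    // Vision uses a bottom-left origin; map y to (1 - y) to get
    // top-left-origin normalized image coordinates.
    let flipY = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -1)

    // ARKit's transform from normalized image coordinates to normalized
    // view coordinates (accounts for rotation and aspect-fill cropping).
    let display = frame.displayTransform(for: orientation,
                                         viewportSize: viewportSize)

    // Scale normalized view coordinates up to points.
    let toPoints = CGAffineTransform(scaleX: viewportSize.width,
                                     y: viewportSize.height)

    return box.applying(flipY).applying(display).applying(toPoints)
}
```

The idea would be to call this with the view's bounds.size and the current interface orientation, then use the result as the frame of an overlay layer. If Vision is instead given a rotated orientation such as .right, the box is in a different coordinate space and this conversion would not apply as written.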
Please help and thank you very much!!!