
I'm using Google's Text detection API from MLKit to detect text from images. It seems to work perfectly on screenshots but when I try to use it on images taken in the app (using AVFoundation) or on photos uploaded from camera roll it spits out a small number of seemingly random characters.

This is my code for running the actual text detection:

func runTextRecognition(with image: UIImage) {
    let visionImage = VisionImage(image: image)
    textRecognizer.process(visionImage) { features, error in
        self.processResult(from: features, error: error)
    }
}

func processResult(from text: VisionText?, error: Error?) {
    guard error == nil, let text = text else {
        print("oops")
        return
    }
    let detectedText = text.text

    let okAlert = UIAlertAction(title: "OK", style: .default) { (action) in
        // handle user input
    }

    let alert = UIAlertController(title: "Detected text", message: detectedText, preferredStyle: .alert)
    alert.addAction(okAlert)

    self.present(alert, animated: true) {
        print("alert was presented")
    }
}

This is my code for using images from camera roll (works for screenshots, not for images taken by camera):

func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
    if let image = info[.originalImage] as? UIImage {
        self.runTextRecognition(with: image)
        uploadView.image = image
    } else {
        print("error")
    }
    self.dismiss(animated: true, completion: nil)
}

This is my code for using photos taken on the camera inside the app (never works, results are always nonsense):

func photoOutput(_ output: AVCapturePhotoOutput,
                 didFinishProcessingPhoto photo: AVCapturePhoto,
                 error: Error?) {
    // Grab the encoded data once instead of force-unwrapping it twice.
    guard let photoData = photo.fileDataRepresentation() else {
        print("error: no photo data")
        return
    }

    // Save the capture to the photo library.
    PHPhotoLibrary.shared().performChanges({
        let creationRequest = PHAssetCreationRequest.forAsset()
        creationRequest.addResource(with: .photo, data: photoData, options: nil)
    }, completionHandler: nil)

    // Run text recognition on the same data.
    let testImage = UIImage(data: photoData)

    self.runTextRecognition(with: testImage!)
}

And this is what I did for using test images that I put in Assets.xcassets (this is the only one that consistently works well):

let uiimage = UIImage(named: "testImage")

self.runTextRecognition(with: uiimage!)

I'm thinking my issues may lie in the orientation of the UIImage, but I'm not sure. Any help would be much appreciated!

  • Photos taken with the camera might be rotated a certain way, but photos in XCAssets are always interpreted as portrait by default. Also, they might be different resolutions, so that may be a problem. – Brandon Nov 05 '18 at 22:52
  • Is the extension of the images in assets.xcassets the same as that of the image taken from the camera? – Karthick Ramesh Nov 05 '18 at 23:02
  • @KarthickRamesh The images taken from the camera don't really have an extension; they're created from the AVCapturePhoto object that is accessible from the photoOutput method in the AVCapturePhotoCaptureDelegate. The image from my assets.xcassets is a JPG. – J. Oh Nov 05 '18 at 23:09
  • @Brandon Hm, I'll look into this, thank you! – J. Oh Nov 05 '18 at 23:10
  • NSData *jpegData = UIImageJPEGRepresentation(image, compressionQuality); can you try converting the image taken from the camera to JPG and doing the same? If it works fine, I can post it as an answer. – Karthick Ramesh Nov 05 '18 at 23:15
  • I checked the .format property of the AVCapturePhotoSettings object I used for the photo capture and it was JPEG, so it should be creating the UIImage from JPEG data. Also, I'm not quite sure how to use Objective-C code in Swift files. – J. Oh Nov 05 '18 at 23:33
  • You can use a bridging header to use Objective-C in Swift files. – Karthick Ramesh Nov 06 '18 at 00:48
  • @Brandon You were right, I did have to rotate the image! Thank you so much – J. Oh Nov 06 '18 at 05:10
  • Hello all, I don't understand this issue. The image is already in a good orientation, so why do I need to rotate it to make it work properly? – Akruti May 02 '19 at 11:33

1 Answer


If your image picker is working fine, the problem is likely the image orientation. For a quick test, capture several images in different orientations and see whether recognition works.

In my case, text recognition worked on images picked from the gallery but not on photos taken with the camera. That was an orientation issue.
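
A quick way to see the difference (a hypothetical diagnostic, not part of the original fix): log the UIImage's imageOrientation before recognition. Portrait camera shots typically report .right, while screenshots and bundled assets report .up, which is why only the latter recognize cleanly.

import UIKit

// Hypothetical helper: log the orientation ML Kit will have to cope with.
func logOrientation(of image: UIImage) {
    switch image.imageOrientation {
    case .up:
        print("orientation: up (pixel data already upright)")
    case .right:
        print("orientation: right (typical portrait camera shot)")
    case .down:
        print("orientation: down")
    case .left:
        print("orientation: left")
    default:
        print("orientation: mirrored variant (rawValue \(image.imageOrientation.rawValue))")
    }
}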

Solution 1

Before converting to a VisionImage, fix the image orientation as follows.

let fixedImage = pickedImage.fixImageOrientation()

Add this extension.

extension UIImage {
    /// Redraws the image so the pixel data matches the display
    /// orientation, returning an image whose orientation is .up.
    func fixImageOrientation() -> UIImage {
        // Nothing to do if the pixels are already upright.
        if imageOrientation == .up { return self }
        // Use the ...WithOptions variant with the image's scale
        // so retina resolution isn't lost in the redraw.
        UIGraphicsBeginImageContextWithOptions(size, false, scale)
        draw(at: .zero)
        let fixedImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        return fixedImage ?? self
    }
}
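
For the camera flow from the question, a minimal sketch of where the fix might slot in (assuming the fixImageOrientation() extension above):

func photoOutput(_ output: AVCapturePhotoOutput,
                 didFinishProcessingPhoto photo: AVCapturePhoto,
                 error: Error?) {
    guard let data = photo.fileDataRepresentation(),
          let rawImage = UIImage(data: data) else { return }

    // Redraw so the pixel data itself is upright before
    // handing the image to the text recognizer.
    let upright = rawImage.fixImageOrientation()
    runTextRecognition(with: upright)
}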

Solution 2

The Firebase documentation provides a method to compute the correct orientation for all cases.

func imageOrientation(
    deviceOrientation: UIDeviceOrientation,
    cameraPosition: AVCaptureDevice.Position
    ) -> VisionDetectorImageOrientation {
    switch deviceOrientation {
    case .portrait:
        return cameraPosition == .front ? .leftTop : .rightTop
    case .landscapeLeft:
        return cameraPosition == .front ? .bottomLeft : .topLeft
    case .portraitUpsideDown:
        return cameraPosition == .front ? .rightBottom : .leftBottom
    case .landscapeRight:
        return cameraPosition == .front ? .topRight : .bottomRight
    case .faceDown, .faceUp, .unknown:
        return .leftTop
    }
}

Create the metadata:

let cameraPosition = AVCaptureDevice.Position.back  // Set to the capture device you used.
let metadata = VisionImageMetadata()
metadata.orientation = imageOrientation(
    deviceOrientation: UIDevice.current.orientation,
    cameraPosition: cameraPosition
)

Set the metadata on the vision image.

let image = VisionImage(buffer: sampleBuffer)
image.metadata = metadata
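
For context, the sampleBuffer in that last snippet comes from a live-capture callback. A minimal sketch of the surrounding delegate, assuming an AVCaptureVideoDataOutput attached to the session, the textRecognizer from the question, and a hypothetical CameraViewController class:

import AVFoundation
import FirebaseMLVision
import UIKit

extension CameraViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        let image = VisionImage(buffer: sampleBuffer)

        // Tell ML Kit how the buffer is rotated, using the
        // imageOrientation(deviceOrientation:cameraPosition:) helper above.
        let metadata = VisionImageMetadata()
        metadata.orientation = imageOrientation(
            deviceOrientation: UIDevice.current.orientation,
            cameraPosition: .back  // assumes the back camera
        )
        image.metadata = metadata

        textRecognizer.process(image) { result, error in
            guard error == nil, let result = result else { return }
            print(result.text)
        }
    }
}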