
I am using Swift's Vision framework for deep learning, and I want to upload the input image to the backend using a REST API. For that, I am converting my UIImage to MultipartFormData using the jpegData() and pngData() functions that Swift natively offers.

I use session.sessionPreset = .vga640x480 to specify the image size in my app for processing.

I was seeing a different image size in the backend, which I was able to confirm in the app, because the UIImage recreated from that image data has a different size.
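
A minimal check along those lines (reusing the question's own self.image; the exact printed values assume a 2× device) looks like this:

if let original = self.image,
   let imageData = original.jpegData(compressionQuality: 1.0),
   let roundTripped = UIImage(data: imageData) {
    // e.g. (640.0, 480.0) at scale 2.0 – the original, in points
    print(original.size, original.scale)
    // e.g. (1280.0, 960.0) at scale 1.0 – the recreated image, now in pixels
    print(roundTripped.size, roundTripped.scale)
}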

This is how I convert the image to multipart form data:

let multipartData = MultipartFormData()
if let imageData = self.image?.jpegData(compressionQuality: 1.0) {
    multipartData.append(imageData, withName: "image", fileName: "image.jpeg", mimeType: "image/jpeg")
}

This is what I see in the Xcode debugger:

[Xcode debugger screenshot showing the image's size]

  • Do you know what `scale` means? :) You have converted size in points to size in pixels. – Sulthan Dec 12 '21 at 15:58
  • Your image is apparently 640×480 points with a scale of 2, resulting in 1280×960 pixels. We can either point you to routines to resize, or better, capture the original at a scale of 1. E.g., if capturing with `UIGraphicsImageRenderer`, specify the scale there, and then no subsequent resizing will be needed. – Rob Dec 12 '21 at 16:04
  • Thanks for the detailed explanation! The original purpose of using the `sessionPreset` of `vga640x480` was so that the original image is "scaled down" to (640, 480), because I don't need (1280, 960) for processing. If my image is 640×480 points with scale 2, will my deep learning model still take the same time to process it as a 1280×960-point image with scale 1? – Madhav Thakker Dec 12 '21 at 16:14
  • Currently, I get the `pixelBuffer` from the `didOutput sampleBuffer: CMSampleBuffer` delegate method by using `pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)` and then `CIImage(cvPixelBuffer: pixelBuffer)`, which my Vision model uses. What would be the better way of handling this, so that I can keep using an image of size (640, 480)? Thanks a lot for your help! – Madhav Thakker Dec 12 '21 at 16:17
  • PNG and JPEG use different compression algorithms, so you get different sizes. – Dmytro Hrebeniuk Dec 15 '21 at 09:38
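
Regarding the UIGraphicsImageRenderer suggestion in the comments above, a minimal sketch of rendering at a scale of 1 might look like this (the helper name renderedAtScaleOne is hypothetical, not from the discussion; the answer below shows a CIContext-based route that avoids re-rendering altogether):

import UIKit

// Hypothetical helper: redraw a UIImage at scale 1, so that its pixel size
// equals its point size (e.g. 640×480 points → 640×480 pixels).
func renderedAtScaleOne(_ image: UIImage) -> UIImage {
    let format = UIGraphicsImageRendererFormat()
    format.scale = 1   // 1 point == 1 pixel
    let renderer = UIGraphicsImageRenderer(size: image.size, format: format)
    return renderer.image { _ in
        image.draw(in: CGRect(origin: .zero, size: image.size))
    }
}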

1 Answer


The following looks intuitive, but manifests the behavior you describe, whereby one ends up with a Data representation of the image with an incorrect scale and pixel size:

let ciImage = CIImage(cvImageBuffer: pixelBuffer) // 640×480
let image = UIImage(ciImage: ciImage)             // says it is 640×480 with a scale of 1

// but if you extract `Data` and then recreate an image from it,
// the size will be off by a multiple of your device's scale
guard let data = image.pngData() else { ... }

However, if you create it via a CGImage, you will get the right result:

let ciImage = CIImage(cvImageBuffer: pixelBuffer)
let ciContext = CIContext()

// rendering through a CGImage produces a UIImage with a scale of 1,
// whose pixel size matches the 640×480 buffer
guard let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent) else { return }
let image = UIImage(cgImage: cgImage)
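
Put in the context of the capture pipeline described in the comments, a minimal sketch of that conversion inside the sample-buffer delegate might look like this (the class name FrameHandler is hypothetical, and reusing a single CIContext across frames is my suggestion, not something from the question):

import AVFoundation
import CoreImage
import UIKit

class FrameHandler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let ciContext = CIContext()   // reuse one context rather than creating one per frame

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
        guard let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent) else { return }

        // 640×480 at a scale of 1, so jpegData()/pngData() now produce a 640×480 asset
        let image = UIImage(cgImage: cgImage)
        // … feed `image` (or `ciImage`) to the Vision request and/or the upload code …
    }
}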

You asked:

If my image is 640×480 points with scale 2, will my deep learning model still take the same time to process it as a 1280×960-point image with scale 1?

There is no difference, as far as the model goes, between 640×480 pt @ 2× and 1280×960 pt @ 1×.

The question is whether 640×480 pt @ 2× is better than 640×480 pt @ 1×: in this case, the model will undoubtedly generate better results, though possibly more slowly, with higher-resolution images (though at 2× the asset is roughly four times larger and slower to upload; on a 3× device it will be roughly nine times larger).

But if you look at the larger asset generated by the direct CIImage » UIImage process, you can see that it did not really capture a 1280×960 snapshot; rather, it captured 640×480 and upscaled it (with some smoothing). So you do not really have a more detailed asset to work with, and it is unlikely to generate better results. You will pay the penalty of the larger asset, but likely without any benefit.

If you need better results with larger images, I would change the preset to a higher resolution, but still avoid the scale-based adjustment by using the CIContext/CGImage-based snippet shared above.
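
For example, a minimal sketch of that change (the .hd1280x720 preset is just one possible choice, and `session` is assumed to be your AVCaptureSession):

if session.canSetSessionPreset(.hd1280x720) {
    session.sessionPreset = .hd1280x720   // capture genuinely larger frames
}
// …and keep converting frames via the CIContext/CGImage route above, so the
// resulting UIImage stays at a scale of 1.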
