I am trying to draw boxes around the text recognized by Firebase ML Kit on top of the image. So far I have had no success: I can't see any boxes at all, because they are all drawn offscreen. I used this article as a reference: https://medium.com/swlh/how-to-draw-bounding-boxes-with-swiftui-d93d1414eb00 and also this project: https://github.com/firebase/quickstart-ios/blob/master/mlvision/MLVisionExample/ViewController.swift
This is the view where the boxes should be shown:
    struct ImageScanned: View {
        var image: UIImage
        @Binding var rectangles: [CGRect]
        @State var viewSize: CGSize = .zero
        var body: some View {
            // TODO: fix scaling
            ZStack {
                Image(uiImage: image)
                    .resizable()
                    .scaledToFit()
                    .overlay(
                        GeometryReader { geometry in
                            ZStack {
                                ForEach(self.transformRectangles(geometry: geometry)) { rect in
                                    Rectangle()
                                        .path(in: CGRect(
                                            x: rect.x,
                                            y: rect.y,
                                            width: rect.width,
                                            height: rect.height))
                                        .stroke(Color.red, lineWidth: 2.0)
                                }
                            }
                        }
                    )
            }
        }
        private func transformRectangles(geometry: GeometryProxy) -> [DetectedRectangle] {
            var rectangles: [DetectedRectangle] = []
            let imageViewWidth = geometry.frame(in: .global).size.width
            let imageViewHeight = geometry.frame(in: .global).size.height
            let imageWidth = image.size.width
            let imageHeight = image.size.height
            let imageViewAspectRatio = imageViewWidth / imageViewHeight
            let imageAspectRatio = imageWidth / imageHeight
            let scale = (imageViewAspectRatio > imageAspectRatio)
                ? imageViewHeight / imageHeight : imageViewWidth / imageWidth
            let scaledImageWidth = imageWidth * scale
            let scaledImageHeight = imageHeight * scale
            let xValue = (imageViewWidth - scaledImageWidth) / CGFloat(2.0)
            let yValue = (imageViewHeight - scaledImageHeight) / CGFloat(2.0)
            var transform = CGAffineTransform.identity.translatedBy(x: xValue, y: yValue)
            transform = transform.scaledBy(x: scale, y: scale)
            for rect in self.rectangles {
                let rectangle = rect.applying(transform)
                rectangles.append(DetectedRectangle(width: rectangle.width, height: rectangle.height, x: rectangle.minX, y: rectangle.minY))
            }
            return rectangles
        }
    }
    struct DetectedRectangle: Identifiable {
        var id = UUID()
        var width: CGFloat = 0
        var height: CGFloat = 0
        var x: CGFloat = 0
        var y: CGFloat = 0
    }
This is the view in which that view is nested:
    struct StartScanView: View {
        @State var showCaptureImageView: Bool = false
        @State var image: UIImage? = nil
        @State var rectangles: [CGRect] = []
        var body: some View {
            ZStack {
                if showCaptureImageView {
                    CaptureImageView(isShown: $showCaptureImageView, image: $image)
                } else {
                    VStack {
                        Button(action: {
                            self.showCaptureImageView.toggle()
                        }) {
                            Text("Start Scanning")
                        }
                        // show here View with rectangles on top of image
                        if self.image != nil {
                            ImageScanned(image: self.image ?? UIImage(), rectangles: $rectangles)
                        }
                        Button(action: {
                            self.processImage()
                        }) {
                            Text("Process Image")
                        }
                    }
                }
            }
        }
        func processImage() {
            let scaledImageProcessor = ScaledElementProcessor()
            if image != nil {
                scaledImageProcessor.process(in: image!) { text in
                    for block in text.blocks {
                        for line in block.lines {
                            for element in line.elements {
                                self.rectangles.append(element.frame)
                            }
                        }
                    }
                }
            }
        }
    }
The calculation from the tutorial made the rectangles too big, and the one from the sample project made them too small.
(Similar for height)
Unfortunately, I can't find out which reference size Firebase uses to determine an element's frame.
This is what it looks like:
Without scaling the width and height at all, the rectangles are roughly (though not exactly) the size they are supposed to be. This leads me to assume that ML Kit does not compute the frames in proportion to image.size.height/width.
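To make the scaling easier to check outside of SwiftUI, the aspect-fit mapping I am after can be sketched as a pure function. This is only a minimal sketch: `aspectFitRect` and its `sourceSize` parameter are hypothetical names that are not part of my project, and it assumes the frames are expressed in the coordinate space of `sourceSize`, which is exactly the part I am unsure about.

```swift
import Foundation

/// Maps a rectangle from source-image coordinates into a view that shows
/// the image aspect-fit (scaled uniformly and centered).
/// Assumption: `rect` is expressed in the same units as `sourceSize`.
func aspectFitRect(_ rect: CGRect, sourceSize: CGSize, viewSize: CGSize) -> CGRect {
    // Avoid dividing by zero for an empty image.
    guard sourceSize.width > 0, sourceSize.height > 0 else { return .zero }

    // Aspect-fit uses the smaller of the two axis ratios.
    let scale = min(viewSize.width / sourceSize.width,
                    viewSize.height / sourceSize.height)

    // Offsets that center the scaled image inside the view.
    let xOffset = (viewSize.width - sourceSize.width * scale) / 2
    let yOffset = (viewSize.height - sourceSize.height * scale) / 2

    // Scale first, then shift by the centering offset.
    return CGRect(x: rect.origin.x * scale + xOffset,
                  y: rect.origin.y * scale + yOffset,
                  width: rect.size.width * scale,
                  height: rect.size.height * scale)
}
```

In the overlay, `viewSize` would be `geometry.size`. Calling the function once with `image.size` as `sourceSize` and once with the pixel dimensions of `image.cgImage` might show which of the two, if either, the frames are relative to.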