4

Does anyone know how to reproduce the new Notes scanning feature in iOS 11?

Is AVFoundation used for the camera?

How is the camera detecting the shape of the paper/document/card?

How do they place the overlay in real time?

How does the camera know when to take the photo?

What's that animated overlay and how can we achieve this?

jscs
R.Radev

2 Answers

3
  • Does anyone know how to reproduce this? Not exactly :P

  • Is AVFoundation used for the camera? Yes

  • How is the camera detecting the shape of the paper/document/card? They are using the Vision framework to do rectangle detection. This is stated in this WWDC session by one of the demonstrators.

  • How do they place the overlay in real time? You should check out the above video for this, as he talks about doing something similar in one of the demos

  • How does the camera know when to take the photo? I'm not familiar with this app but it's surely triggered in the capture session, no?

  • What's that animated overlay and how can we achieve this? Not sure about this, but I'd imagine it's some kind of CALayer with animation

  • Is Tesseract framework used for the image afterwards? Isn't Tesseract OCR for text? If you're looking for handwriting recognition, you might want to look for an MNIST model
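
To make the Vision and CALayer points above a bit more concrete, here is a minimal sketch of how the pieces might fit together. It is not how Notes is actually implemented. The class name `RectangleOverlayDetector`, the helper `layerPoint(fromNormalized:in:)`, the threshold values, and the `.right` orientation (portrait, back camera) are all assumptions for illustration. The coordinate mapping also assumes the overlay layer covers the whole frame and ignores any cropping done by the preview layer's video gravity.

```swift
import AVFoundation
import UIKit
import Vision

/// Hypothetical helper: maps a normalized Vision point (origin bottom-left)
/// to layer coordinates (origin top-left) for a layer of the given size.
func layerPoint(fromNormalized p: CGPoint, in size: CGSize) -> CGPoint {
    return CGPoint(x: p.x * size.width, y: (1 - p.y) * size.height)
}

/// Hypothetical class: runs rectangle detection on each camera frame and
/// outlines the result with a CAShapeLayer.
final class RectangleOverlayDetector: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    /// The caller adds this layer on top of its AVCaptureVideoPreviewLayer.
    let overlay = CAShapeLayer()

    override init() {
        super.init()
        overlay.strokeColor = UIColor.yellow.cgColor
        overlay.fillColor = UIColor.yellow.withAlphaComponent(0.3).cgColor
        overlay.lineWidth = 2
    }

    // Called by AVFoundation for every delivered video frame.
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        let request = VNDetectRectanglesRequest { [weak self] request, _ in
            let rectangle = (request.results as? [VNRectangleObservation])?.first
            // Layer updates belong on the main thread.
            DispatchQueue.main.async { self?.draw(rectangle) }
        }
        request.maximumObservations = 1   // only the most prominent rectangle
        request.minimumConfidence = 0.6   // illustrative value, tune as needed
        request.minimumAspectRatio = 0.3  // illustrative value, tune as needed

        // Assumption: portrait device orientation with the back camera.
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                            orientation: .right,
                                            options: [:])
        try? handler.perform([request])
    }

    // Rebuilds the overlay path from the detected corners, or clears it.
    private func draw(_ observation: VNRectangleObservation?) {
        guard let observation = observation else {
            overlay.path = nil
            return
        }
        let size = overlay.bounds.size
        let path = UIBezierPath()
        path.move(to: layerPoint(fromNormalized: observation.topLeft, in: size))
        path.addLine(to: layerPoint(fromNormalized: observation.topRight, in: size))
        path.addLine(to: layerPoint(fromNormalized: observation.bottomRight, in: size))
        path.addLine(to: layerPoint(fromNormalized: observation.bottomLeft, in: size))
        path.close()
        overlay.path = path.cgPath
    }
}
```

To use it, you would set an instance as the delegate of an `AVCaptureVideoDataOutput` via `setSampleBufferDelegate(_:queue:)` and add `overlay` as a sublayer above the preview layer, with its frame matching the preview. Animating the path change (for example with a `CABasicAnimation` on `path`) is left out here.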

Brian
  • Hm, good point on the WWDC session. I just watched it, and they really do say that Notes is using Vision. I'll do some tests and demos with it, including CALayer, to see if I'll be able to reproduce it. You are right about Tesseract. I am doing some kind of text detection app, but it's not actually related to the topic, so I'll delete the line. MNIST doesn't really help me. Thanks anyway. :) – R.Radev Nov 11 '17 at 16:52
1

Use Apple’s rectangle detection SDK, which provides an easy-to-use API that can identify rectangles in still images or video sequences in near-realtime. The algorithm works very well in simple scenes with a single prominent rectangle in a clean background, but is less accurate in more complicated scenes, such as capturing small receipts or business cards in cluttered backgrounds, which are essential use-cases for our scanning feature.

CIDetector: An image processor that identifies notable features (such as faces and barcodes) in a still image or video.

https://developer.apple.com/documentation/coreimage/cidetector
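
For a still image, rectangle detection with `CIDetector` could look roughly like the sketch below. The function name `detectRectangle(in:)` is made up for this example, and the high-accuracy option is just one reasonable choice. The returned corners are in Core Image coordinates (origin at the bottom-left of the image).

```swift
import CoreImage

/// Hypothetical helper: returns the four corners of the first rectangle that
/// CIDetector finds in `image`, or nil if no rectangle is detected.
func detectRectangle(in image: CIImage) -> [CGPoint]? {
    // The initializer is failable, so the detector is optional.
    let detector = CIDetector(ofType: CIDetectorTypeRectangle,
                              context: nil,
                              options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])
    let features = detector?.features(in: image) ?? []
    guard let rectangle = features.first as? CIRectangleFeature else { return nil }
    // Corners in clockwise order, starting at the top-left.
    return [rectangle.topLeft, rectangle.topRight,
            rectangle.bottomRight, rectangle.bottomLeft]
}
```

The corners can then be fed to a perspective-correction filter or drawn as an overlay, after converting them to the coordinate space of whatever view shows the image.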