Questions tagged [apple-vision]

Apple Vision is a high-level computer vision framework used to identify faces, detect and track features, and classify images and video content.

The Apple Vision framework performs face and face-landmark detection on input images and video, barcode recognition, image registration, text detection, and feature tracking. The Vision API also allows the use of custom Core ML models for tasks such as classification or object detection.
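Most of the questions below revolve around the same basic pattern: build a VNRequest, hand an image to a request handler, and read typed observations back in the completion handler. A minimal sketch of that pattern, assuming `cgImage` comes from elsewhere in your app:

```swift
import Vision
import CoreGraphics

// Minimal Vision usage: detect face rectangles in a single CGImage.
func detectFaces(in cgImage: CGImage) {
    let request = VNDetectFaceRectanglesRequest { request, error in
        guard let observations = request.results as? [VNFaceObservation] else { return }
        for face in observations {
            // boundingBox is normalized (0...1) with the origin at the lower left.
            print("Face at \(face.boundingBox)")
        }
    }
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```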

205 questions
2 votes • 0 answers

Get Depth from detected face using Vision and ARKit iOS

I am trying to achieve something like this using Vision and ARKit: my idea is to get landmark points from Vision and place a node using those points. I am using this demo as a reference. So far, I have been able to find the landmark points of the…
Zღk • 854 • 7 • 25
2 votes • 2 answers

Apple Vision Framework: detect smiles or happy faces with observations?

I'm working on a project that uses the Vision Framework to detect faces in images and then uses a CoreML model to detect if the face is smiling. The problem is that the CoreML model file is nearly 500 MB. I don't want to bloat my app that…
HansG600 • 260 • 4 • 12
2 votes • 0 answers

Facial Tracking with ARKit (using Vision)

I am currently using Vision with ARKit to find any faces in the frame. The code I'm using to do this is below: func runFaceDetection() { let pixelBuffer : CVPixelBuffer? = (sceneView.session.currentFrame?.capturedImage) if…
Alex Wulff • 2,039 • 3 • 18 • 29
2 votes • 0 answers

Tracking eyes with Vision framework

How can you use the new Vision framework in iOS 11 to track eyes in a video while the head or camera is moving (using the front camera)? I've found VNDetectFaceLandmarksRequest to be very slow on my iPad; landmarks requests are performed roughly…
iosdude • 1,131 • 10 • 27
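The eye data itself comes from the landmarks on each VNFaceObservation. A minimal sketch of reading the eye regions once a VNDetectFaceLandmarksRequest has produced an observation (the coordinate-conversion helper mentioned in the comment is part of Vision):

```swift
import Vision

// Read the normalized eye landmark points from a face observation.
func eyePoints(from observation: VNFaceObservation) -> (left: [CGPoint], right: [CGPoint]) {
    let left = observation.landmarks?.leftEye?.normalizedPoints ?? []
    let right = observation.landmarks?.rightEye?.normalizedPoints ?? []
    // Points are normalized to the face bounding box; use
    // VNImagePointForFaceLandmarkPoint (or convert manually) to get image coordinates.
    return (left, right)
}
```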
2 votes • 0 answers

VNDetectFaceLandmarksRequest lags when used with AVCaptureVideoDataOutput

I am using a VNDetectFaceLandmarksRequest combined with a VNSequenceRequestHandler to process images coming from the AVCaptureVideoDataOutput delegate call: func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer:…
zumzum • 17,984 • 26 • 111 • 172
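For context, the pattern being described looks roughly like the sketch below; class and property names are illustrative, not taken from the post.

```swift
import AVFoundation
import Vision

// Feed frames from AVCaptureVideoDataOutput into a VNSequenceRequestHandler.
final class FaceLandmarksProcessor: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let sequenceHandler = VNSequenceRequestHandler()

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let request = VNDetectFaceLandmarksRequest { request, _ in
            let faces = request.results as? [VNFaceObservation] ?? []
            // Keep this work light to avoid dropping frames.
            print("Detected \(faces.count) face(s)")
        }
        // The orientation value depends on which camera is used and how the device is held.
        try? sequenceHandler.perform([request], on: pixelBuffer, orientation: .leftMirrored)
    }
}
```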
2 votes • 4 answers

Regarding code extract from Apple's WWDC 2017 Session 506 - Where is exifOrientationFromDeviceOrientation() defined?

In the WWDC 2017 video from Session 506, there's a piece of code in the first demo that looks like this: let exifOrientation = self.exifOrientationFromDeviceOrientation() The use of self. indicates that it is supposed to be a property from the…
Andre Guerra • 1,117 • 1 • 9 • 18
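The helper is not part of Vision or AVFoundation; the session code assumes you define it yourself. One commonly used definition, similar to what Apple's later sample code ships, looks like the sketch below; the exact mapping depends on which camera you use and whether the image is mirrored.

```swift
import UIKit
import ImageIO

// Map the current device orientation to an EXIF/CGImagePropertyOrientation
// value suitable for passing to a Vision request handler.
func exifOrientationFromDeviceOrientation() -> CGImagePropertyOrientation {
    switch UIDevice.current.orientation {
    case .portraitUpsideDown: return .left        // home button on top
    case .landscapeLeft:      return .upMirrored  // home button on the right
    case .landscapeRight:     return .down        // home button on the left
    case .portrait:           return .up          // home button on the bottom
    default:                  return .up
    }
}
```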
2 votes • 1 answer

Only detect in a section of the camera preview layer, iOS, Swift

I am trying to get a detection zone in a live preview on my camera preview layer. Is this possible? Say there is a live feed and you have face detection on; as you look around, it should only put a box around the face in a certain area for…
Tony Merritt • 1,177 • 11 • 35
2 votes • 0 answers

iOS Vision Framework image rectification

I'd like to perform rectification of a pair of images using the iOS Vision Framework's VNHomographicImageRegistrationRequest. Is it possible? So far, I've obtained a 3x3 warp matrix that doesn't seem to rectify the images. How is the warp matrix…
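For reference, a minimal sketch of obtaining the warp matrix from VNHomographicImageRegistrationRequest, assuming `floatingImage` and `referenceImage` are CGImages. Note the caveat in the comment: the request estimates a single homography between the two images, which only aligns them cleanly when the scene is (near-)planar; it is not a full stereo rectification on its own.

```swift
import Vision
import simd

// Ask Vision for the homography that maps the floating image onto the reference image.
func homography(from floatingImage: CGImage, to referenceImage: CGImage) -> matrix_float3x3? {
    let request = VNHomographicImageRegistrationRequest(targetedCGImage: floatingImage, options: [:])
    let handler = VNImageRequestHandler(cgImage: referenceImage, options: [:])
    try? handler.perform([request])
    let observation = request.results?.first as? VNImageHomographicAlignmentObservation
    // warpTransform is a 3x3 matrix; a general stereo pair also needs camera
    // intrinsics/extrinsics for true rectification.
    return observation?.warpTransform
}
```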
2 votes • 0 answers

Getting image part of VNTextObservation rectangles in Vision Framework

I am able to get the rectangles of text detected in the Vision framework video feed in iOS 11, but I am trying to get the part of the image that was recognized as text or a character. Can someone help with that? func detectTextHandler(request:…
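A minimal sketch of cropping that region out of a still CGImage, assuming you already have a VNTextObservation; Vision bounding boxes are normalized with the origin at the lower left, so the y axis has to be flipped for CGImage cropping.

```swift
import Vision
import CoreGraphics

// Crop the region of a CGImage covered by a text observation.
func cropText(from image: CGImage, observation: VNTextObservation) -> CGImage? {
    let width = CGFloat(image.width)
    let height = CGFloat(image.height)
    let box = observation.boundingBox
    let rect = CGRect(x: box.origin.x * width,
                      y: (1 - box.origin.y - box.height) * height,
                      width: box.width * width,
                      height: box.height * height)
    return image.cropping(to: rect)
}
```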
2 votes • 0 answers

Vision with camera live stream and ARKit

I'd like to implement a scenario like this: use a live camera stream with Vision to detect some rectangles, process that output according to some logic, then display AR elements based on the result with ARKit. The examples…
AppsDev • 12,319 • 23 • 93 • 186
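A sketch of the first stage of such a pipeline, running Vision rectangle detection on the current ARKit camera frame; the throttling hint in the comment is an assumption, not from the post.

```swift
import ARKit
import Vision

// Run rectangle detection on an ARKit camera frame. Running this on every
// frame is expensive; throttle (e.g. every Nth frame) on a background queue.
func detectRectangles(in frame: ARFrame, completion: @escaping ([VNRectangleObservation]) -> Void) {
    let request = VNDetectRectanglesRequest { request, _ in
        completion(request.results as? [VNRectangleObservation] ?? [])
    }
    request.maximumObservations = 4
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}
```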
2 votes • 0 answers

CoreMLTools converted Keras Model fails at VNCoreMLTransform

I'm learning Apple's Vision and CoreML frameworks but got stuck on how to use my own retrained models. I tried training a VGG16 model with Keras based on this tutorial. Everything looks OK except for some Keras version warnings. Then I tried…
CodeBrew • 6,457 • 2 • 43 • 48
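For reference, the usual way to wire a converted model into Vision is via VNCoreMLModel and VNCoreMLRequest; `MyConvertedModel` below is a placeholder for whatever class Xcode generates from the converted .mlmodel file. VNCoreMLTransform errors typically point at a mismatch between the model's declared image input and what Vision feeds it.

```swift
import Vision
import CoreML
import CoreVideo

// Wrap a converted Core ML model for use with Vision and classify one frame.
func classify(pixelBuffer: CVPixelBuffer) {
    guard let coreMLModel = try? MyConvertedModel(configuration: MLModelConfiguration()).model,
          let visionModel = try? VNCoreMLModel(for: coreMLModel) else { return }
    let request = VNCoreMLRequest(model: visionModel) { request, error in
        let results = request.results as? [VNClassificationObservation] ?? []
        print(results.first?.identifier ?? "no result", results.first?.confidence ?? 0)
    }
    request.imageCropAndScaleOption = .centerCrop
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}
```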
2 votes • 2 answers

How can I use the object tracking API of the Vision framework on iOS 11?

// init bounding CGRect rect = CGRectMake(0, 0, 0.3, 0.3); VNSequenceRequestHandler* reqImages = [[VNSequenceRequestHandler alloc] init]; VNRectangleObservation* ObserveRect = [VNRectangleObservation…
Alberl • 43 • 1 • 6
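A Swift sketch of the same idea using the tracking API: seed a tracker with a normalized rectangle (mirroring the CGRectMake(0, 0, 0.3, 0.3) in the snippet above) and update it frame by frame with a reused VNSequenceRequestHandler. Class and property names are illustrative.

```swift
import Vision
import CoreVideo
import CoreGraphics

// Track an object seeded from a normalized bounding box across video frames.
final class RectangleTracker {
    private let sequenceHandler = VNSequenceRequestHandler()
    private var lastObservation = VNDetectedObjectObservation(
        boundingBox: CGRect(x: 0, y: 0, width: 0.3, height: 0.3))

    func track(in pixelBuffer: CVPixelBuffer) -> CGRect? {
        let request = VNTrackObjectRequest(detectedObjectObservation: lastObservation)
        request.trackingLevel = .accurate
        try? sequenceHandler.perform([request], on: pixelBuffer)
        guard let result = request.results?.first as? VNDetectedObjectObservation else { return nil }
        lastObservation = result   // feed the updated observation into the next frame
        return result.boundingBox
    }
}
```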
2 votes • 0 answers

VNDetectFaceLandmarksRequest 50% slower when using regionOfInterest

I am doing real time face recognition on a video stream. Right now, it's a bit slow, so I decided to use the regionOfInterest of my VNDetectFaceLandmarksRequest to reduce the size of the image where the algorithm has to do face recognition. The…
Antzi • 12,831 • 7 • 48 • 74
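For reference, regionOfInterest is set directly on the request as a normalized rectangle with the origin at the lower left; a minimal sketch of the setup being described:

```swift
import Vision
import CoreGraphics

// Restrict the landmarks request to the central quarter of the image.
let faceLandmarksRequest = VNDetectFaceLandmarksRequest()
faceLandmarksRequest.regionOfInterest = CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5)
```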
1 vote • 0 answers

Positional errors handling text detected with the Vision framework in SwiftUI

I am new to SwiftUI, and I am trying to write a feature that takes in an image, detects the text from the image, translates the text through Google's API, and displays the translated text under or to the right of the original text. I use SwiftUI's…
Hardy Wen • 70 • 6
1 vote • 1 answer

How to apply Vision Framework to Video Playback

I have only found examples that apply the Vision framework to live camera capture, which I already have working. I also want to apply body pose detection and drawing to video playback. I have the following code, which already plays back…
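One possible approach, sketched below, is to attach an AVPlayerItemVideoOutput to the player item and run a body-pose request on pixel buffers pulled during playback; the per-frame callback wiring (for example a CADisplayLink) is assumed and not shown.

```swift
import AVFoundation
import Vision

// Pull decoded frames from the player item and run body-pose detection on them.
let videoOutput = AVPlayerItemVideoOutput(pixelBufferAttributes: nil)
// playerItem.add(videoOutput)  // `playerItem` is assumed to be the item being played

func processCurrentFrame(at time: CMTime) {
    guard videoOutput.hasNewPixelBuffer(forItemTime: time),
          let pixelBuffer = videoOutput.copyPixelBuffer(forItemTime: time,
                                                        itemTimeForDisplay: nil) else { return }
    let request = VNDetectHumanBodyPoseRequest { request, _ in
        let poses = request.results as? [VNHumanBodyPoseObservation] ?? []
        // Each observation exposes recognized joints via recognizedPoints(_:).
        print("Detected \(poses.count) body pose(s)")
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}
```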