Questions tagged [apple-vision]

Apple Vision is a high-level computer vision framework for identifying faces, detecting and tracking features, and classifying images, video, tabular data, audio, and motion sensor data.

The Apple Vision framework performs face and face-landmark detection on input images and video, barcode recognition, image registration, text detection, and feature tracking. The Vision API also allows the use of custom Core ML models for tasks like classification or object detection.

205 questions
1 vote · 0 answers

Subclassing Apple's VNImageBasedRequest for my own OpenCV process

I'm currently using OpenCV to process an image, while also using VNCoreMLRequest requests/handlers (unrelated) like so: var image_cap = CameraCaptureManager.GetImageFromPixelBuffer(pixelBuffer) let imageRequestHandler =…
justin • 651
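
Vision's request classes aren't documented as subclassable, so a common pattern is to keep the OpenCV step outside of Vision and share the same pixel buffer. A minimal sketch, assuming a hypothetical `visionModel` (a `VNCoreMLModel`) and `runOpenCVProcessing` helper:

```swift
import Vision

// Run the custom OpenCV work and the Vision request side by side on one buffer,
// rather than trying to fold OpenCV into a VNImageBasedRequest subclass.
func process(_ pixelBuffer: CVPixelBuffer) {
    runOpenCVProcessing(pixelBuffer)  // custom OpenCV step, independent of Vision

    let request = VNCoreMLRequest(model: visionModel) { request, error in
        guard let results = request.results else { return }
        print(results)
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}
```
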
1 vote · 0 answers

Scanning VIN barcodes with Apple's Vision framework

I can scan Code 39 barcodes, but scanning a VIN barcode (which is a subset of Code 39) doesn't work. Does anyone know if it's possible to scan a VIN barcode using VNDetectBarcodesRequest? func processClassification(_ request: VNRequest) { guard…
zer0day • 236
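
VIN labels are typically Code 39 with a check digit, so one thing worth trying is enabling the checksum variants of the symbology as well. A short sketch:

```swift
import Vision

// Request all Code 39 variants, including the checksum ones a VIN label may use.
let barcodeRequest = VNDetectBarcodesRequest { request, error in
    guard let barcodes = request.results as? [VNBarcodeObservation] else { return }
    for barcode in barcodes {
        print(barcode.symbology, barcode.payloadStringValue ?? "<no payload>")
    }
}
barcodeRequest.symbologies = [.code39, .code39Checksum,
                              .code39FullASCII, .code39FullASCIIChecksum]
```
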
1 vote · 0 answers

How can I implement my Hand Pose Classification (WWDC21) ML Model in a Swift app

Last month I created a Hand Pose Classification .mlmodel using Create ML for a school project. My model works great in the live preview and when I upload an image to it. I tried following the video that Apple provided on their developer website and…
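
The WWDC21 approach feeds the keypoints from a hand-pose request into the classifier. A sketch, where `HandPoseClassifier` is the hypothetical Create ML model class and its input/output names (`poses`, `label`) must be checked against the auto-generated interface:

```swift
import CoreML
import Vision

let handPoseRequest = VNDetectHumanHandPoseRequest()

func classifyHandPose(in pixelBuffer: CVPixelBuffer) throws {
    handPoseRequest.maximumHandCount = 1
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([handPoseRequest])
    guard let observation = handPoseRequest.results?.first else { return }

    // The observation's keypoints become the classifier's input multi-array.
    let keypoints = try observation.keypointsMultiArray()
    let classifier = try HandPoseClassifier(configuration: MLModelConfiguration())
    let prediction = try classifier.prediction(poses: keypoints)
    print(prediction.label)
}
```
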
1 vote · 0 answers

Convert image in CVPixelBuffer to greyscale

I'm trying to convert the CVPixelBuffer I get from the iPhone camera (using CMSampleBufferGetImageBuffer) to a grayscale CVPixelBuffer, for use in CoreML. Code looks like this: extension CVPixelBuffer { enum Error: Swift.Error { …
Drew McCormack • 3,490
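
One possible approach (a sketch, not necessarily the question's own extension) is to render the buffer through a Core Image grayscale filter into a one-component destination buffer:

```swift
import CoreImage

let ciContext = CIContext()

func grayscale(_ buffer: CVPixelBuffer) -> CVPixelBuffer? {
    // CIPhotoEffectNoir produces a grayscale image.
    let image = CIImage(cvPixelBuffer: buffer).applyingFilter("CIPhotoEffectNoir")

    var output: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault,
                        CVPixelBufferGetWidth(buffer),
                        CVPixelBufferGetHeight(buffer),
                        kCVPixelFormatType_OneComponent8,
                        nil, &output)
    guard let output = output else { return nil }
    // Destination-format support can vary by OS; rendering to BGRA and letting
    // the model accept a grayscale input is a fallback.
    ciContext.render(image, to: output)
    return output
}
```
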
1 vote · 1 answer

Convert points from Vision coordinates to UIKit coordinates in VNDetectHumanHandPoseRequest

I'm trying to implement Detecting Hand Poses with Vision in an ARSCNView. I've successfully made the VNDetectHumanHandPoseRequest, obtained the detected hand pose results, and converted the Vision coordinates to AVFoundation coordinates, but now I want to convert…
TechGps1 • 21
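
For an ARSCNView, one way is to go through ARFrame's display transform: flip Vision's lower-left-origin point to a top-left origin, map it into normalized view space, then scale to points. A sketch, assuming `frame` comes from `sceneView.session.currentFrame`:

```swift
import ARKit

func viewPoint(for visionPoint: CGPoint, frame: ARFrame, in sceneView: ARSCNView) -> CGPoint {
    // Vision: normalized, origin bottom-left -> AVFoundation-style, origin top-left.
    let flipped = CGPoint(x: visionPoint.x, y: 1 - visionPoint.y)
    // Map the normalized camera-image point into normalized view coordinates.
    let transform = frame.displayTransform(for: .portrait,
                                           viewportSize: sceneView.bounds.size)
    let normalized = flipped.applying(transform)
    return CGPoint(x: normalized.x * sceneView.bounds.width,
                   y: normalized.y * sceneView.bounds.height)
}
```
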
1 vote · 1 answer

The best way to overlay 3D objects on coordinates recognized by Vision

I'm trying to render a 3D object at coordinates recognized by the Vision framework. I already know SceneKit, and I want realistic rendering. I'm wondering if there's a more appropriate approach than SceneKit. Thanks in advance for…
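
Staying with SceneKit via ARSCNView, a raycast from the 2D point Vision produced (already converted to view coordinates) gives a world transform to attach a node to; RealityKit's ARView offers the same raycast-based placement if its rendering is preferred. A sketch:

```swift
import ARKit

func placeMarker(at viewPoint: CGPoint, in sceneView: ARSCNView) {
    guard let query = sceneView.raycastQuery(from: viewPoint,
                                             allowing: .estimatedPlane,
                                             alignment: .any),
          let result = sceneView.session.raycast(query).first else { return }

    let node = SCNNode(geometry: SCNSphere(radius: 0.01))
    node.simdTransform = result.worldTransform
    sceneView.scene.rootNode.addChildNode(node)
}
```
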
1 vote · 2 answers

Swift Vision Framework - VNRecognizeTextRequest: argument passed to call that takes no arguments

I'm currently building a small CLI tool in Swift 5.4 and wanted to use the Vision framework to extract all the text in an image. The article I'm following to accomplish this task is provided by Apple and can be found here. As you can see, my code is…
JBDev • 144
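
With misleading compiler errors like this, one thing worth checking in a CLI target is the deployment target, since VNRecognizeTextRequest is unavailable before macOS 10.15. A minimal sketch that compiles against a 10.15+ target:

```swift
import Foundation
import ImageIO
import Vision

// Load an image from a path argument and print the recognized text.
let url = URL(fileURLWithPath: CommandLine.arguments[1])
guard let source = CGImageSourceCreateWithURL(url as CFURL, nil),
      let cgImage = CGImageSourceCreateImageAtIndex(source, 0, nil) else {
    fatalError("could not load image")
}

let textRequest = VNRecognizeTextRequest { request, error in
    let observations = request.results as? [VNRecognizedTextObservation] ?? []
    for observation in observations {
        print(observation.topCandidates(1).first?.string ?? "")
    }
}
textRequest.recognitionLevel = .accurate

try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([textRequest])
```
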
1 vote · 1 answer

Vision language detection

I'm using Vision provided by Apple to convert some images into text. It's working well, but the problem I currently have is with Chinese characters. I'm doing this currently: let request = VNRecognizeTextRequest(completionHandler:…
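
Chinese recognition requires the `.accurate` level and a sufficiently recent OS/request revision, and the language list is ordered, so "zh-Hans" should come first. A sketch that also queries what the current revision actually supports rather than assuming:

```swift
import Vision

let textRequest = VNRecognizeTextRequest { request, error in
    let observations = request.results as? [VNRecognizedTextObservation] ?? []
    observations.forEach { print($0.topCandidates(1).first?.string ?? "") }
}
textRequest.recognitionLevel = .accurate          // required for Chinese
textRequest.recognitionLanguages = ["zh-Hans", "en-US"]

if #available(iOS 15.0, macOS 12.0, *) {
    // Confirm "zh-Hans" actually appears before relying on it.
    print((try? textRequest.supportedRecognitionLanguages()) ?? [])
}
```
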
1 vote · 0 answers

What is the best way to utilize memory inside of "session(_:didUpdate:)" method?

My use case is that I want to calculate various gestures of a hand (the first hand) seen by the camera. I am able to find body anchors, hand anchors, and poses. See my video here. I am trying to use previous SIMD3 position information to calculate…
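
One way to keep state across `session(_:didUpdate:)` calls is a small fixed-capacity history owned by the delegate, instead of allocating per frame. The names here are illustrative:

```swift
import simd

final class FingertipHistory {
    private(set) var positions: [SIMD3<Float>] = []
    private let capacity = 30   // roughly half a second at 60 fps

    func append(_ p: SIMD3<Float>) {
        positions.append(p)
        if positions.count > capacity { positions.removeFirst() }
    }

    // Average per-frame displacement over the stored window; useful as a
    // crude velocity estimate for gesture heuristics.
    var meanDelta: SIMD3<Float>? {
        guard let first = positions.first, let last = positions.last,
              positions.count > 1 else { return nil }
        return (last - first) / Float(positions.count - 1)
    }
}
```
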
1 vote · 0 answers

ARKit – Track more than one person with VNDetectHumanBodyPoseRequest

Is there any way to track multiple bodies using VNDetectHumanBodyPoseRequest and work with the resulting rigs in ARKit?
H.Y.C. • 11
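
On the Vision side this is straightforward, since the request returns one observation per detected person; ARKit's ARBodyTrackingConfiguration, by contrast, has tracked only a single body anchor at a time, so multiple rigs would have to come from the Vision route. A sketch of the Vision side:

```swift
import Vision

let bodyPoseRequest = VNDetectHumanBodyPoseRequest()

func detectBodies(in pixelBuffer: CVPixelBuffer) throws {
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([bodyPoseRequest])

    // One VNHumanBodyPoseObservation per detected person.
    for observation in bodyPoseRequest.results ?? [] {
        let joints = try observation.recognizedPoints(.all)
        print("body with \(joints.count) joints")
    }
}
```
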
1 vote · 1 answer

How to display Vision output in a UI?

I am relatively new to coding, and I have recently been working on a program that allows a user to scan a crystal using the iPhone's rear camera, and it will identify what kind of crystal it is. I used Create ML to build the model, and Vision to…
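
The usual wiring is to take the top classification from the request's results and push it to a label on the main thread, since Vision completion handlers run on a background queue. A sketch, where `visionModel` and `resultLabel` are assumed to exist:

```swift
import UIKit
import Vision

let classifyRequest = VNCoreMLRequest(model: visionModel) { request, _ in
    guard let best = (request.results as? [VNClassificationObservation])?.first else { return }
    // UI updates must happen on the main thread.
    DispatchQueue.main.async {
        resultLabel.text = String(format: "%@ (%.0f%%)", best.identifier, best.confidence * 100)
    }
}
```
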
1 vote · 0 answers

ARKit and ResNet 50

I built recognition using ResNet 50. How do I add the ability to find the distance from one object to another using ARKit, so that the name of the object is displayed on the screen in three-dimensional letters? I did it in a UIKit storyboard, not…
Mara • 11
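
A sketch of the two pieces: placing a 3D text label at a raycast hit, and measuring the distance between two placed nodes. `sceneView` is assumed to be the ARSCNView:

```swift
import ARKit
import SceneKit

func placeLabel(_ text: String, at result: ARRaycastResult, in sceneView: ARSCNView) -> SCNNode {
    let geometry = SCNText(string: text, extrusionDepth: 1)
    let node = SCNNode(geometry: geometry)
    node.scale = SCNVector3(0.002, 0.002, 0.002)   // SCNText units are large
    node.simdPosition = simd_make_float3(result.worldTransform.columns.3)
    sceneView.scene.rootNode.addChildNode(node)
    return node
}

func distance(_ a: SCNNode, _ b: SCNNode) -> Float {
    simd_distance(a.simdWorldPosition, b.simdWorldPosition)   // meters
}
```
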
1 vote · 1 answer

Using CoreML to classify NSImages

I'm trying to work with Core ML in Xcode to classify images that are simply single digits or letters. To start out, I'm just using .png images of digits. Using the Create ML tool, I built an image classifier (NOT including any Vision support stuff) and…
user2132980 • 195
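
Without Vision, Core ML can still handle the image scaling via `MLFeatureValue(cgImage:constraint:options:)` on macOS 10.15+. A sketch, where `DigitClassifier` is hypothetical and the feature names ("image", "classLabel") are typical Create ML defaults to be verified against the generated interface:

```swift
import AppKit
import CoreML

func classify(_ nsImage: NSImage) throws -> String? {
    let model = try DigitClassifier(configuration: MLModelConfiguration()).model
    guard let cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil),
          let constraint = model.modelDescription
              .inputDescriptionsByName["image"]?.imageConstraint else { return nil }

    // Core ML scales the CGImage to the model's expected input size.
    let value = try MLFeatureValue(cgImage: cgImage, constraint: constraint, options: nil)
    let input = try MLDictionaryFeatureProvider(dictionary: ["image": value])
    let output = try model.prediction(from: input)
    return output.featureValue(for: "classLabel")?.stringValue
}
```
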
1 vote · 0 answers

Vision image analysis/segmentation

I'm running into a roadblock as to how to deal with this issue. For instance, if someone takes a picture of, say, a screwdriver, I want my app to be able to identify the screwdriver as well as black it out. I think I would use CoreML/Vision to…
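
A heavily hedged sketch: Vision has no built-in general object segmentation, so a segmentation model such as Apple's DeepLabV3 (from the Core ML model gallery) would do the per-pixel work, and its MLMultiArray of class indices can be turned into a mask for blacking out the matching pixels:

```swift
import Vision

func makeSegmentationRequest(model: VNCoreMLModel) -> VNCoreMLRequest {
    VNCoreMLRequest(model: model) { request, _ in
        guard let observation = request.results?.first as? VNCoreMLFeatureValueObservation,
              let classMap = observation.featureValue.multiArrayValue else { return }
        // Walk classMap, build a mask for the target class (e.g. via Core Image
        // compositing), and black out those pixels in the photo.
        print(classMap.shape)
    }
}
```
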
1 vote · 1 answer

How to extract the outer lips from face landmark points using the Vision framework in Swift

I implemented an addFaceLandmarksToImage function to crop the outer lips from the image. The function first detects the face in the image using Vision and converts the face bounding box size and origin to image size and origin. Then I…
Roman • 35
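
The `outerLips` landmark region can be mapped straight into image coordinates with `pointsInImage(imageSize:)`; note the bottom-left origin, so the y values need flipping before building a UIKit crop path. A sketch:

```swift
import UIKit
import Vision

func outerLipsPoints(in cgImage: CGImage) throws -> [CGPoint] {
    let request = VNDetectFaceLandmarksRequest()
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])

    guard let face = request.results?.first as? VNFaceObservation,
          let outerLips = face.landmarks?.outerLips else { return [] }

    let size = CGSize(width: cgImage.width, height: cgImage.height)
    return outerLips.pointsInImage(imageSize: size)
        .map { CGPoint(x: $0.x, y: size.height - $0.y) }   // flip to top-left origin
}
```
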