
How does one extract the SceneKit depth buffer? I'm making an AR-based app that runs with Metal, and I'm really struggling to find any info on how to extract a 2D depth buffer so I can render out fancy 3D photos of my scenes. Any help greatly appreciated.

Andy Jazz
Dan M

1 Answer


Your question is unclear, but I'll try to answer.

Depth pass from VR view

If you need to render a Depth pass from SceneKit's 3D environment, then you can use, for instance, the SCNGeometrySource.Semantic structure. It has vertex, normal, texcoord, color and tangent type properties. Let's see what the vertex type property is:

static let vertex: SCNGeometrySource.Semantic

This semantic identifies data containing the positions of each vertex in the geometry. For a custom shader program, you use this semantic to bind SceneKit’s vertex position data to an input attribute of the shader. Vertex position data is typically an array of three- or four-component vectors.
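
As a rough sketch of how that layout can be used in practice (this snippet is mine, not from the sample project below, and it assumes float components with at least three values per vector), you can read per-vertex positions back out of any SCNGeometry like so:

import SceneKit

// Sketch: pull per-vertex positions out of an existing geometry via the
// .vertex semantic. loadUnaligned(fromByteOffset:as:) needs Swift 5.7+.
func vertexPositions(of geometry: SCNGeometry) -> [SCNVector3] {
    guard let source = geometry.sources(for: .vertex).first,
          source.usesFloatComponents,
          source.componentsPerVector >= 3,
          source.bytesPerComponent == MemoryLayout<Float>.size else { return [] }

    return source.data.withUnsafeBytes { raw -> [SCNVector3] in
        (0..<source.vectorCount).map { i in
            let base = source.dataOffset + i * source.dataStride
            let x = raw.loadUnaligned(fromByteOffset: base, as: Float.self)
            let y = raw.loadUnaligned(fromByteOffset: base + 4, as: Float.self)
            let z = raw.loadUnaligned(fromByteOffset: base + 8, as: Float.self)
            return SCNVector3(x, y, z)
        }
    }
}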

Here's a code excerpt from the iOS Depth Sample project.

UPDATED: Using this code you can get a position for every point in an SCNScene and assign a color to each of these points (which is essentially what a zDepth channel is):

import SceneKit

// One interleaved vertex: position (x, y, z) followed by color (r, g, b).
struct PointCloudVertex {
    var x: Float, y: Float, z: Float
    var r: Float, g: Float, b: Float
}

@objc class PointCloud: NSObject {

    var pointCloud: [SCNVector3] = []   // one world-space position per point
    var colors: [UInt8] = []            // four RGBA bytes per point
    
    public func pointCloudNode() -> SCNNode {
        let points = self.pointCloud
        var vertices = Array(repeating: PointCloudVertex(x: 0, y: 0, z: 0,
                                                          r: 0, g: 0, b: 0),
                             count: points.count)
        
        // Fill the interleaved buffer; 0..<points.count also handles an empty array safely.
        for i in 0..<points.count {
            let p = points[i]
            vertices[i].x = Float(p.x)
            vertices[i].y = Float(p.y)
            vertices[i].z = Float(p.z)
            vertices[i].r = Float(colors[i * 4]) / 255.0      // colors are packed as RGBA bytes
            vertices[i].g = Float(colors[i * 4 + 1]) / 255.0
            vertices[i].b = Float(colors[i * 4 + 2]) / 255.0
        }
        
        let node = buildNode(points: vertices)
        return node
    }
    
    private func buildNode(points: [PointCloudVertex]) -> SCNNode {
        // Interleaved vertex buffer shared by the position and color sources.
        let vertexData = NSData(
            bytes: points,
            length: MemoryLayout<PointCloudVertex>.stride * points.count
        )
        // Positions: the first three floats of each interleaved vertex.
        let positionSource = SCNGeometrySource(
            data: vertexData as Data,
            semantic: SCNGeometrySource.Semantic.vertex,
            vectorCount: points.count,
            usesFloatComponents: true,
            componentsPerVector: 3,
            bytesPerComponent: MemoryLayout<Float>.size,
            dataOffset: 0,
            dataStride: MemoryLayout<PointCloudVertex>.stride
        )
        // Colors: the next three floats, so the offset skips the position.
        let colorSource = SCNGeometrySource(
            data: vertexData as Data,
            semantic: SCNGeometrySource.Semantic.color,
            vectorCount: points.count,
            usesFloatComponents: true,
            componentsPerVector: 3,
            bytesPerComponent: MemoryLayout<Float>.size,
            dataOffset: MemoryLayout<Float>.size * 3,
            dataStride: MemoryLayout<PointCloudVertex>.stride
        )
        // Draw every vertex as a point primitive; no index data is needed.
        let element = SCNGeometryElement(
            data: nil,
            primitiveType: .point,
            primitiveCount: points.count,
            bytesPerIndex: MemoryLayout<Int>.size
        )

        element.pointSize = 1
        element.minimumPointScreenSpaceRadius = 1
        element.maximumPointScreenSpaceRadius = 5

        let pointsGeometry = SCNGeometry(sources: [positionSource, colorSource], elements: [element])
        
        return SCNNode(geometry: pointsGeometry)
    }
}
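
And a quick usage sketch (the two points and their RGBA bytes below are made-up placeholders; in a real app you'd fill both arrays from your scene):

let scene = SCNScene()
let cloud = PointCloud()
cloud.pointCloud = [SCNVector3(0, 0, -1), SCNVector3(0.1, 0, -1.2)]
cloud.colors = [255, 255, 255, 255,   // RGBA bytes for the 1st point
                128, 128, 128, 255]   // RGBA bytes for the 2nd point
scene.rootNode.addChildNode(cloud.pointCloudNode())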

Depth pass from AR view

If you need to render a Depth pass from ARSCNView, it is possible only if you're using ARFaceTrackingConfiguration for the front-facing camera. If so, you can use the capturedDepthData instance property, which brings you a depth map captured along with the video frame.

var capturedDepthData: AVDepthData? { get }

But this depth map is captured at only 15 fps and at a lower resolution than the corresponding RGB image at 60 fps.

Face-based AR uses the front-facing, depth-sensing camera on compatible devices. When running such a configuration, frames vended by the session contain a depth map captured by the depth camera in addition to the color pixel buffer (see capturedImage) captured by the color camera. This property’s value is always nil when running other AR configurations.

And real code could look like this:

extension ViewController: ARSCNViewDelegate {

    func renderer(_ renderer: SCNSceneRenderer, updateAtTime time: TimeInterval) {

        DispatchQueue.global().async {

            guard let frame = self.sceneView.session.currentFrame else {
                return
            }
            // capturedDepthData is an AVDepthData, not a pixel buffer itself;
            // its depthDataMap is the CVPixelBuffer holding the per-pixel depth.
            if let depthData = frame.capturedDepthData {
                self.depthImage = depthData.depthDataMap
            }
        }
    }
}
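
And if you then need that depth map as a plain grayscale UIImage, a rough sketch could look like this (note the raw float values aren't normalized here, so in practice you'd usually remap them into 0...1 first):

import UIKit
import AVFoundation
import CoreImage

// Sketch: wrap the depth map in a CIImage and render it out as a UIImage.
func grayscaleImage(from depthData: AVDepthData) -> UIImage? {
    // Make sure we're looking at 32-bit float depth values.
    let converted = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
    let ciImage = CIImage(cvPixelBuffer: converted.depthDataMap)
    let context = CIContext()
    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else { return nil }
    return UIImage(cgImage: cgImage)
}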

Depth pass from Video view

Also, you can extract a true Depth pass using the two rear cameras and the AVFoundation framework.

Look at the Image Depth Map tutorial, where the concept of disparity is introduced.
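
For reference, a minimal sketch of that AVFoundation route (the class name and the lack of error handling are my simplifications; it assumes a device whose rear dual camera supports depth delivery):

import AVFoundation

final class DepthCaptureController: NSObject, AVCapturePhotoCaptureDelegate {
    let session = AVCaptureSession()
    private let photoOutput = AVCapturePhotoOutput()

    func configure() throws {
        session.beginConfiguration()
        guard let device = AVCaptureDevice.default(.builtInDualCamera,
                                                   for: .video,
                                                   position: .back) else {
            fatalError("This device has no rear dual camera")
        }
        session.addInput(try AVCaptureDeviceInput(device: device))
        session.addOutput(photoOutput)
        // Depth delivery must be enabled on the output before requesting it per photo.
        photoOutput.isDepthDataDeliveryEnabled = photoOutput.isDepthDataDeliverySupported
        session.commitConfiguration()
    }

    func capture() {
        let settings = AVCapturePhotoSettings()
        settings.isDepthDataDeliveryEnabled = photoOutput.isDepthDataDeliveryEnabled
        photoOutput.capturePhoto(with: settings, delegate: self)
    }

    func photoOutput(_ output: AVCapturePhotoOutput,
                     didFinishProcessingPhoto photo: AVCapturePhoto,
                     error: Error?) {
        // photo.depthData carries the disparity map the tutorial talks about.
        print("Got depth data:", photo.depthData as Any)
    }
}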

Andy Jazz
  • Thanks for such a broad answer! I'm just after the depth buffer for the 3D objects I'm passing into my back-facing camera's feed - so that would be your first option? It's a bit of an ask, but if you could clarify that option a little bit further *please* - the other two techniques seem rather well documented out there, however my need, which I expected would be quite a normal request, isn't. I just need a grayscale UIImage or even a CVImage etc., happy to resize if necessary to match the RGB buffer I'm capturing. Thanks. – Dan M May 03 '19 at 07:11
  • To be precise, what do you need exactly: DepthPass from SceneKit (for virtual objects) or DepthPass from ARSCNView (for real world objects)? Or both? – Andy Jazz May 04 '19 at 07:37
  • Only virtual objects, my phone is too old for both unfortunately. – Dan M May 05 '19 at 00:39
  • I updated my answer. There's a link to a real `depth-channel-extraction` project. You can find all the features you're looking for there. – Andy Jazz May 05 '19 at 05:16
  • I'm trying to be concise, but I'm only needing the virtual objects! - it's still going to be a great answer either way though! E.g., just getting the depth of the geometry I add at runtime. So sorry, and really appreciate your answers. -- just compiling now, but the Readme.md seems to be just for the camera view and not the virtual objects. – Dan M May 05 '19 at 05:28
  • The code in my updated answer is for extracting the zChannel for `virtual objects`. It's not a ready application or a ready-to-use solution; it's a starting point for an implementation. – Andy Jazz May 05 '19 at 07:27