10

I am attempting to find the depth data at a certain point in the captured image and return the distance in meters.

I have enabled depth data and am capturing it alongside the image. I take the X,Y coordinates of a point (the center of the image, or wherever the user presses) and convert them to an index into the depth buffer using

Int((width - touchPoint.x) * (height - touchPoint.y))

with width and height being the dimensions of the captured image. I am not sure this is the correct way to do it, though.

I handle the depth data as such:

func handlePhotoDepthCalculation(point : Int) {

    guard let depth = self.photo else {
        return
    }

    //
    // Convert Disparity to Depth
    //
    let depthData = (depth.depthData as AVDepthData!).converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
    let depthDataMap = depthData.depthDataMap //AVDepthData -> CVPixelBuffer

    //
    // Set Accuracy feedback
    //
    let accuracy = depthData.depthDataAccuracy
    switch (accuracy) {
    case .absolute:
        /* 
        NOTE - Values within the depth map are absolutely 
        accurate within the physical world.
        */
        self.accuracyLbl.text = "Absolute"
        break
    case .relative:
        /* 
        NOTE - Values within the depth data map are usable for 
        foreground/background separation, but are not absolutely 
        accurate in the physical world. iPhone always produces this.
        */
        self.accuracyLbl.text = "Relative"
    }

    //
    // We convert the data
    //
    CVPixelBufferLockBaseAddress(depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
    let depthPointer = unsafeBitCast(CVPixelBufferGetBaseAddress(depthDataMap), to: UnsafeMutablePointer<Float32>.self)

    //
    // Get depth value for image center
    //
    let distanceAtXYPoint = depthPointer[point]

    //
    // Set UI
    //
    self.distanceLbl.text = "\(distanceAtXYPoint) m" //Returns distance in meters?
    self.filteredLbl.text = "\(depthData.isDepthDataFiltered)" 
}

I am not convinced I am getting the correct position. Also, from my research, it looks like accuracy is only reported as .relative or .absolute, and not as a float or integer?

Andy Jazz
Allreadyhome

4 Answers

5

To access the depth data at a CGPoint do:

let point = CGPoint(x: 35, y: 26) // pixel coordinates inside the depth map
let width = CVPixelBufferGetWidth(depthDataMap)
let distanceAtXYPoint = depthPointer[Int(point.y * CGFloat(width) + point.x)]

I hope it works.
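
A minimal sketch of the row-major lookup, using a plain Swift array in place of the real pixel buffer. The 4x3 map, its values, and the helper name `flatIndex` are made up for illustration, and the sketch assumes the rows are stored back to back with no padding. It also shows that the formula from the question, (width - x) * (height - y), lands on a different pixel.

```swift
// Hypothetical 4x3 depth map stored row by row, with no padding between rows.
let sampleWidth = 4
let sampleHeight = 3
let depths: [Float] = [
    0.5, 0.6, 0.7, 0.8,   // row y = 0
    1.5, 1.6, 1.7, 1.8,   // row y = 1
    2.5, 2.6, 2.7, 2.8    // row y = 2
]

/// Returns the flat index of pixel (x, y), or nil when the pixel is out of bounds.
func flatIndex(x: Int, y: Int, width: Int, height: Int) -> Int? {
    guard x >= 0, x < width, y >= 0, y < height else { return nil }
    return y * width + x
}

// Row-major lookup for pixel (2, 1).
let index = flatIndex(x: 2, y: 1, width: sampleWidth, height: sampleHeight)
let depthAtPoint = index.map { depths[$0] }

// The formula from the question, evaluated for the same pixel.
let questionIndex = (sampleWidth - 2) * (sampleHeight - 1)

print("row-major index:", index as Any, "depth:", depthAtPoint as Any)
print("index from the question's formula:", questionIndex)
```

The bounds check matters in practice because reading past the end of a raw pointer does not trap the way an out-of-range array index does.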

Sergio Bonilla
  • This is the correct answer. The depth data is a one-dimensional array representing all the points. To find the data for x,y you need y*width + x – ChrisH Oct 23 '18 at 19:32
  • This method resulted in incorrect and fluctuating values for me. Check out https://stackoverflow.com/a/66636994/9142902 for my solution – i4guar Mar 15 '21 at 11:21
1

Access depth data at pixel position:

let depthDataMap: CVPixelBuffer = ...
let pixelX: Int = ...
let pixelY: Int = ...

// Lock the buffer before touching its memory (read-only access is enough here)
CVPixelBufferLockBaseAddress(depthDataMap, .readOnly)
let bytesPerRow = CVPixelBufferGetBytesPerRow(depthDataMap)
let baseAddress = CVPixelBufferGetBaseAddress(depthDataMap)!
assert(kCVPixelFormatType_DepthFloat32 == CVPixelBufferGetPixelFormatType(depthDataMap))

// Step to the start of row pixelY in bytes, then index by pixel within that row
let rowData = baseAddress + pixelY * bytesPerRow
let distance = rowData.assumingMemoryBound(to: Float32.self)[pixelX]

CVPixelBufferUnlockBaseAddress(depthDataMap, .readOnly)

For me the values were incorrect and inconsistent when accessing the depth through

let depthPointer = unsafeBitCast(CVPixelBufferGetBaseAddress(depthDataMap), to: UnsafeMutablePointer<Float32>.self)
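
One plausible reason the two approaches disagree is row padding: when bytesPerRow is larger than width * 4, indexing the buffer as a flat Float32 array drifts further off with every row. The sketch below simulates this with a manually allocated buffer instead of a real CVPixelBuffer. The 3x2 size, the 16-byte row stride, the pixel values, and the function names are all assumptions chosen for illustration.

```swift
/// Reads pixel (x, y) by stepping `bytesPerRow` bytes per row.
func readPixel(_ base: UnsafeRawPointer, x: Int, y: Int, bytesPerRow: Int) -> Float32 {
    let row = base + y * bytesPerRow
    return row.assumingMemoryBound(to: Float32.self)[x]
}

/// Reads pixel (x, y) assuming rows are tightly packed (y * width + x).
func readPixelPacked(_ base: UnsafeRawPointer, x: Int, y: Int, width: Int) -> Float32 {
    return base.assumingMemoryBound(to: Float32.self)[y * width + x]
}

/// Builds a hypothetical 3x2 Float32 image whose rows are padded to 16 bytes
/// (12 bytes of pixels plus 4 bytes of padding) and reads pixel (0, 1) both ways.
func demoPaddedRows() -> (strided: Float32, packed: Float32) {
    let width = 3
    let height = 2
    let bytesPerRow = 16
    let floatsPerRow = bytesPerRow / MemoryLayout<Float32>.stride
    let floatCount = floatsPerRow * height

    let buffer = UnsafeMutableRawPointer.allocate(
        byteCount: bytesPerRow * height,
        alignment: MemoryLayout<Float32>.alignment)
    defer { buffer.deallocate() }

    let floats = buffer.bindMemory(to: Float32.self, capacity: floatCount)
    // Fill everything, padding included, with -1 so stray reads are obvious.
    for i in 0..<floatCount { floats[i] = -1 }
    // Write the pixel values: row 0 = 10, 11, 12 and row 1 = 20, 21, 22.
    for y in 0..<height {
        for x in 0..<width {
            floats[y * floatsPerRow + x] = Float32(10 * (y + 1) + x)
        }
    }

    let strided = readPixel(buffer, x: 0, y: 1, bytesPerRow: bytesPerRow)
    let packed = readPixelPacked(buffer, x: 0, y: 1, width: width)
    return (strided, packed)
}

let paddedResult = demoPaddedRows()
print("bytesPerRow lookup:", paddedResult.strided)
print("flat-array lookup: ", paddedResult.packed)
```

In this simulated buffer the flat-array lookup reads the padding at the end of row 0 instead of the first pixel of row 1. Whether a real depth map is padded depends on the device and format, so checking CVPixelBufferGetBytesPerRow is the safer habit.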
i4guar
0

Values indicating the general accuracy of a depth data map.

The accuracy of a depth data map is highly dependent on the camera calibration data used to generate it. If the camera's focal length cannot be precisely determined at the time of capture, scaling error in the z (depth) plane will be introduced. If the camera's optical center can't be precisely determined at capture time, principal point error will be introduced, leading to an offset error in the disparity estimate. These values report the accuracy of a map's values with respect to its reported units.

case relative

Values within the depth data map are usable for foreground/background separation, but are not absolutely accurate in the physical world.

case absolute

Values within the depth map are absolutely accurate within the physical world.

You have to express the CGPoint in the coordinate space of the AVDepthData buffer, using the buffer's width and height as in the following code.

// Useful data
 let width = CVPixelBufferGetWidth(depthDataMap) 
 let height = CVPixelBufferGetHeight(depthDataMap) 
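
Once the depth map's width and height are known, a point has to be converted into pixel coordinates that fall inside it. The sketch below is one way to do that. It assumes the point has already been normalized to the 0...1 range on each axis, that the values are finite, and that the depth map has the same orientation as the image the point came from. The function name and the 640x480 size are hypothetical.

```swift
/// Converts a normalized point (0...1 on each axis) into integer pixel
/// coordinates inside a map of the given size, clamping to the valid range.
func pixelCoordinate(normalizedX: Double, normalizedY: Double,
                     width: Int, height: Int) -> (x: Int, y: Int) {
    let x = Int(normalizedX * Double(width))
    let y = Int(normalizedY * Double(height))
    // Clamp so that a point on the right or bottom edge stays inside the map.
    let clampedX = min(max(x, 0), width - 1)
    let clampedY = min(max(y, 0), height - 1)
    return (x: clampedX, y: clampedY)
}

// Hypothetical 640x480 depth map.
let center = pixelCoordinate(normalizedX: 0.5, normalizedY: 0.5, width: 640, height: 480)
let corner = pixelCoordinate(normalizedX: 1.0, normalizedY: 1.0, width: 640, height: 480)
print("center:", center, "bottom-right corner:", corner)
```

Because the depth map is usually smaller than the photo, working from a normalized point avoids mixing the two resolutions.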
BuLB JoBs
  • I have already seen the question where you copied that answer from. I have also already shown how I get my data point - `Int((width - touchPoint.x) * (height - touchPoint.y))` which is the same. This is just a description of the problem and not an answer imo. – Allreadyhome Nov 16 '17 at 10:22
0

In Apple's sample project they use the code below.

texturePoint is the touch point projected onto the Metal view used in the sample project.

// scale
let scale = CGFloat(CVPixelBufferGetWidth(depthFrame)) / CGFloat(CVPixelBufferGetWidth(videoFrame))
let depthPoint = CGPoint(x: CGFloat(CVPixelBufferGetWidth(depthFrame)) - 1.0 - texturePoint.x * scale, y: texturePoint.y * scale)
        
assert(kCVPixelFormatType_DepthFloat16 == CVPixelBufferGetPixelFormatType(depthFrame))
CVPixelBufferLockBaseAddress(depthFrame, .readOnly)
let rowData = CVPixelBufferGetBaseAddress(depthFrame)! + Int(depthPoint.y) * CVPixelBufferGetBytesPerRow(depthFrame)
// Swift does not have a Float16 data type. Use UInt16 instead, and then translate
var f16Pixel = rowData.assumingMemoryBound(to: UInt16.self)[Int(depthPoint.x)]
CVPixelBufferUnlockBaseAddress(depthFrame, .readOnly)
        
var f32Pixel = Float(0.0)
var src = vImage_Buffer(data: &f16Pixel, height: 1, width: 1, rowBytes: 2)
var dst = vImage_Buffer(data: &f32Pixel, height: 1, width: 1, rowBytes: 4)
vImageConvert_Planar16FtoPlanarF(&src, &dst, 0)
        
// Convert the depth value from meters to centimeters
let depthString = String(format: "%.2f cm", f32Pixel * 100)
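
To illustrate what the half-to-float step does, here is a pure Swift sketch that decodes an IEEE 754 half-precision bit pattern by hand. It is not the sample project's method, which uses vImage as shown above. The function name `floatFromHalfBits` is made up for this example.

```swift
/// Decodes the bit pattern of an IEEE 754 half-precision value into a Float.
func floatFromHalfBits(_ half: UInt16) -> Float {
    let sign = UInt32(half & 0x8000) << 16
    let exponent = UInt32(half & 0x7C00) >> 10
    let mantissa = UInt32(half & 0x03FF)
    let bits: UInt32

    if exponent == 0 {
        if mantissa == 0 {
            // Signed zero.
            bits = sign
        } else {
            // Subnormal half: shift the mantissa up until its implicit
            // leading bit appears, lowering the exponent for each shift.
            var e: UInt32 = 113
            var m = mantissa
            while m & 0x0400 == 0 {
                m <<= 1
                e -= 1
            }
            bits = sign | (e << 23) | ((m & 0x03FF) << 13)
        }
    } else if exponent == 0x1F {
        // Infinity or NaN: all exponent bits set in the result as well.
        bits = sign | 0x7F80_0000 | (mantissa << 13)
    } else {
        // Normal number: rebias the exponent from 15 to 127 (add 112).
        bits = sign | ((exponent + 112) << 23) | (mantissa << 13)
    }
    return Float(bitPattern: bits)
}

// 0x3C00 is the half-precision encoding of 1.0, so this would be 100 cm.
let meters = floatFromHalfBits(0x3C00)
print("\(meters * 100) cm")
```

For whole frames the vImage call is the better fit, since it converts a full buffer in one pass. A hand-written decoder like this is mainly useful for checking a single pixel.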
Ozgur Sahin