
I am working on a project that uses the iOS front-facing depth (TrueDepth) camera in Swift. According to Apple's documentation, the media type is kCVPixelFormatType_DepthFloat16, a half-precision float at 640×360 resolution and 30 fps. I am stuck on how to retrieve and process the depth value pixel by pixel.

let buffer: CVPixelBuffer = depthData.depthDataMap // depthData is of type AVDepthData
CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
let width = CVPixelBufferGetWidth(buffer)
let height = CVPixelBufferGetHeight(buffer)
for y in 0 ..< height {
  for x in 0 ..< width {
    let pixel = ?? // what should I do here?
  }
}
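
For reference, the depth map above arrives through an AVCaptureDepthDataOutput delegate callback roughly like the following (a simplified sketch of the capture setup, with error handling omitted; this part is not what I am asking about):

import AVFoundation

class DepthCaptureController: NSObject, AVCaptureDepthDataOutputDelegate {
    let session = AVCaptureSession()
    let depthOutput = AVCaptureDepthDataOutput()

    func configure() {
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera, for: .video, position: .front),
              let input = try? AVCaptureDeviceInput(device: device),
              session.canAddInput(input) else { return }
        session.beginConfiguration()
        session.addInput(input)
        if session.canAddOutput(depthOutput) {
            session.addOutput(depthOutput)
            depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth.queue"))
        }
        session.commitConfiguration()
        session.startRunning()
    }

    // Called at up to 30 fps with the kCVPixelFormatType_DepthFloat16 map described above.
    func depthDataOutput(_ output: AVCaptureDepthDataOutput, didOutput depthData: AVDepthData, timestamp: CMTime, connection: AVCaptureConnection) {
        // depthData.depthDataMap is the CVPixelBuffer iterated over above
    }
}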
debiluz

2 Answers


I have solved my problem already. This can be done in two ways.

  1. Use kCVPixelFormatType_DepthFloat32 instead of kCVPixelFormatType_DepthFloat16; it has the same dimensions and frame rate as the previous depth map. You can then read each pixel as a Swift Float like the following:
let width = CVPixelBufferGetWidth(buffer)
let height = CVPixelBufferGetHeight(buffer)

CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
let floatBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(buffer), to: UnsafeMutablePointer<Float>.self)

for y in 0 ..< height {
    for x in 0 ..< width {
        let pixel = floatBuffer[y * width + x] // depth value at (x, y)
    }
}
CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
  2. The second way keeps kCVPixelFormatType_DepthFloat16: read the raw pixel as a UInt16 and convert the half-precision value to a 32-bit Float with vImage (a whole-buffer variant of this conversion is sketched below).
// Requires `import Accelerate` for vImage.
// To access the point at row y, column x; thanks to this project: https://github.com/edvardHua/Articles/tree/master/%5BAR:MR%20%E5%9F%BA%E7%A1%80%5D%20%E5%88%A9%E7%94%A8%20iPhone%20X%20%E7%9A%84%E6%B7%B1%E5%BA%A6%E7%9B%B8%E6%9C%BA(TruthDepth%20Camera)%E8%8E%B7%E5%BE%97%E5%83%8F%E7%B4%A0%E7%82%B9%E7%9A%84%E4%B8%89%E7%BB%B4%E5%9D%90%E6%A0%87/Obtain3DCoordinate
let rowData = CVPixelBufferGetBaseAddress(buffer)! + y * CVPixelBufferGetBytesPerRow(buffer)
var f16Pixel = rowData.assumingMemoryBound(to: UInt16.self)[x]
var f32Pixel = Float(0.0)
var src = vImage_Buffer(data: &f16Pixel, height: 1, width: 1, rowBytes: 2)
var dst = vImage_Buffer(data: &f32Pixel, height: 1, width: 1, rowBytes: 4)
vImageConvert_Planar16FtoPlanarF(&src, &dst, 0)
let depth = f32Pixel // depth in cm
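
Not from the original answer, but as a follow-up: the same vImage routine can convert the whole Float16 depth map in a single call instead of pixel by pixel. A rough sketch (the helper name is my own):

import Accelerate
import CoreVideo

// Convert an entire kCVPixelFormatType_DepthFloat16 map into a row-major [Float].
func depthValues(from buffer: CVPixelBuffer) -> [Float] {
    CVPixelBufferLockBaseAddress(buffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

    let width = CVPixelBufferGetWidth(buffer)
    let height = CVPixelBufferGetHeight(buffer)

    var src = vImage_Buffer(data: CVPixelBufferGetBaseAddress(buffer),
                            height: vImagePixelCount(height),
                            width: vImagePixelCount(width),
                            rowBytes: CVPixelBufferGetBytesPerRow(buffer))

    var result = [Float](repeating: 0, count: width * height)
    result.withUnsafeMutableBufferPointer { ptr in
        var dst = vImage_Buffer(data: ptr.baseAddress,
                                height: vImagePixelCount(height),
                                width: vImagePixelCount(width),
                                rowBytes: width * MemoryLayout<Float>.stride)
        vImageConvert_Planar16FtoPlanarF(&src, &dst, 0)
    }
    return result // index with result[y * width + x]
}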
debiluz
  • More info I found: 1. the byte buffer is row-major; 2. planes are used for different colors (so there is no separate plane for depth data); 3. Float16 is intended for GPU work and Float32 for CPU work; 4. normalize your 3D depth map with cameraCalibrationData if you do computer vision work (a back-projection sketch follows below). Best link: https://developer.apple.com/videos/play/wwdc2017/507 – itMaxence Mar 04 '20 at 09:37
  • And a [link to better understand image formats](https://software.intel.com/en-us/ipp-dev-reference-pixel-and-planar-image-formats) used in CVPixelBuffer. – itMaxence Mar 04 '20 at 12:15
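
To illustrate point 4 of the comment above, here is a rough sketch of back-projecting a single depth pixel to a camera-space 3D point using AVCameraCalibrationData. The function name and parameters are assumptions for illustration, not something from this thread:

import AVFoundation
import simd

// Back-project pixel (x, y) with depth value z into camera-space coordinates,
// scaling the intrinsics from their reference dimensions down to the depth map width.
func cameraSpacePoint(x: Int, y: Int, z: Float,
                      calibration: AVCameraCalibrationData,
                      depthMapWidth: Int) -> SIMD3<Float> {
    let intrinsics = calibration.intrinsicMatrix                               // 3x3 camera matrix
    let referenceWidth = Float(calibration.intrinsicMatrixReferenceDimensions.width)
    let scale = Float(depthMapWidth) / referenceWidth                          // e.g. 640 / full sensor width

    let fx = intrinsics.columns.0.x * scale   // focal length x
    let fy = intrinsics.columns.1.y * scale   // focal length y
    let cx = intrinsics.columns.2.x * scale   // principal point x
    let cy = intrinsics.columns.2.y * scale   // principal point y

    let X = (Float(x) - cx) * z / fx
    let Y = (Float(y) - cy) * z / fy
    return SIMD3<Float>(X, Y, z)
}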

Can't you just do the following?

let convertedDepthData = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat16)
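
converting(toDepthDataType:) is an instance method that returns a new AVDepthData in the requested format. A minimal sketch of how it might be combined with the pixel loop from the accepted answer, converting to Float32 so each value reads directly as a Swift Float (that choice is mine, not part of this answer):

let converted = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
let buffer = converted.depthDataMap

CVPixelBufferLockBaseAddress(buffer, .readOnly)
let width = CVPixelBufferGetWidth(buffer)
let height = CVPixelBufferGetHeight(buffer)
// Assumes rows are tightly packed, as in the accepted answer.
let floatBuffer = CVPixelBufferGetBaseAddress(buffer)!.assumingMemoryBound(to: Float.self)

for y in 0 ..< height {
    for x in 0 ..< width {
        let depth = floatBuffer[y * width + x] // depth value at (x, y)
    }
}
CVPixelBufferUnlockBaseAddress(buffer, .readOnly)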
Jeremy Caney
  • While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. – Yunnosch Sep 18 '22 at 17:24