
I am working on a project that uses the iOS front-facing depth (TrueDepth) camera in Swift. According to Apple's documentation, the media type is kCVPixelFormatType_DepthFloat16, a half-precision float at 640×360 resolution and 30 fps. I am stuck on how to retrieve and process the depth value pixel by pixel.

let buffer: CVPixelBuffer = depthData.depthDataMap // depthData is of type AVDepthData
CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
let width = CVPixelBufferGetWidth(buffer)
let height = CVPixelBufferGetHeight(buffer)
for y in 0 ..< height {
  for x in 0 ..< width {
    let pixel = ?? // what should I do here?
  }
}
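
For reference, the depth map above arrives through an AVCaptureDepthDataOutput delegate callback roughly like the following (a simplified sketch of the capture setup, with error handling omitted; this part is not what I am asking about):

import AVFoundation

class DepthCaptureController: NSObject, AVCaptureDepthDataOutputDelegate {
    let session = AVCaptureSession()
    let depthOutput = AVCaptureDepthDataOutput()

    func configure() {
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera, for: .video, position: .front),
              let input = try? AVCaptureDeviceInput(device: device),
              session.canAddInput(input) else { return }
        session.beginConfiguration()
        session.addInput(input)
        if session.canAddOutput(depthOutput) {
            session.addOutput(depthOutput)
            depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth.queue"))
        }
        session.commitConfiguration()
        session.startRunning()
    }

    // Called at up to 30 fps with the kCVPixelFormatType_DepthFloat16 map described above.
    func depthDataOutput(_ output: AVCaptureDepthDataOutput, didOutput depthData: AVDepthData, timestamp: CMTime, connection: AVCaptureConnection) {
        // depthData.depthDataMap is the CVPixelBuffer iterated over above
    }
}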
debiluz

2 Answers


I have solved my problem already. This can be done in two ways.

  1. Use kCVPixelFormatType_DepthFloat32 instead of kCVPixelFormatType_DepthFloat16; it has the same dimensions and frame rate as the previous depth map. You can then read each pixel as a Swift Float like the following:
let width = CVPixelBufferGetWidth(buffer)
let height = CVPixelBufferGetHeight(buffer)

CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
let floatBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(buffer), to: UnsafeMutablePointer<Float>.self)

for y in 0 ..< height {
    for x in 0 ..< width {
        let pixel = floatBuffer[y * width + x] // depth value at (x, y)
    }
}
CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
  2. The second way keeps kCVPixelFormatType_DepthFloat16: read the raw pixel as a UInt16 and convert the half-precision value to a 32-bit Float with vImage (a whole-buffer variant of this conversion is sketched below).
// Requires `import Accelerate` for vImage.
// To access the point at row y, column x; thanks to this project: https://github.com/edvardHua/Articles/tree/master/%5BAR:MR%20%E5%9F%BA%E7%A1%80%5D%20%E5%88%A9%E7%94%A8%20iPhone%20X%20%E7%9A%84%E6%B7%B1%E5%BA%A6%E7%9B%B8%E6%9C%BA(TruthDepth%20Camera)%E8%8E%B7%E5%BE%97%E5%83%8F%E7%B4%A0%E7%82%B9%E7%9A%84%E4%B8%89%E7%BB%B4%E5%9D%90%E6%A0%87/Obtain3DCoordinate
let rowData = CVPixelBufferGetBaseAddress(buffer)! + y * CVPixelBufferGetBytesPerRow(buffer)
var f16Pixel = rowData.assumingMemoryBound(to: UInt16.self)[x]
var f32Pixel = Float(0.0)
var src = vImage_Buffer(data: &f16Pixel, height: 1, width: 1, rowBytes: 2)
var dst = vImage_Buffer(data: &f32Pixel, height: 1, width: 1, rowBytes: 4)
vImageConvert_Planar16FtoPlanarF(&src, &dst, 0)
let depth = f32Pixel // depth in cm
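
Not from the original answer, but as a follow-up: the same vImage routine can convert the whole Float16 depth map in a single call instead of pixel by pixel. A rough sketch (the helper name is my own):

import Accelerate
import CoreVideo

// Convert an entire kCVPixelFormatType_DepthFloat16 map into a row-major [Float].
func depthValues(from buffer: CVPixelBuffer) -> [Float] {
    CVPixelBufferLockBaseAddress(buffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

    let width = CVPixelBufferGetWidth(buffer)
    let height = CVPixelBufferGetHeight(buffer)

    var src = vImage_Buffer(data: CVPixelBufferGetBaseAddress(buffer),
                            height: vImagePixelCount(height),
                            width: vImagePixelCount(width),
                            rowBytes: CVPixelBufferGetBytesPerRow(buffer))

    var result = [Float](repeating: 0, count: width * height)
    result.withUnsafeMutableBufferPointer { ptr in
        var dst = vImage_Buffer(data: ptr.baseAddress,
                                height: vImagePixelCount(height),
                                width: vImagePixelCount(width),
                                rowBytes: width * MemoryLayout<Float>.stride)
        vImageConvert_Planar16FtoPlanarF(&src, &dst, 0)
    }
    return result // index with result[y * width + x]
}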
debiluz
  • More info I found: 1. the byte buffer is row-major; 2. planes are used for different colors (so there is no separate plane for depth data); 3. Float16 is intended for GPU work and Float32 for CPU work; 4. normalize your 3D depth map with cameraCalibrationData if you do computer vision work (a back-projection sketch follows below). Best link: https://developer.apple.com/videos/play/wwdc2017/507 – itMaxence Mar 04 '20 at 09:37
  • And a [link to better understand image formats](https://software.intel.com/en-us/ipp-dev-reference-pixel-and-planar-image-formats) used in CVPixelBuffer. – itMaxence Mar 04 '20 at 12:15
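
To illustrate point 4 of the comment above, here is a rough sketch of back-projecting a single depth pixel to a camera-space 3D point using AVCameraCalibrationData. The function name and parameters are assumptions for illustration, not something from this thread:

import AVFoundation
import simd

// Back-project pixel (x, y) with depth value z into camera-space coordinates,
// scaling the intrinsics from their reference dimensions down to the depth map width.
func cameraSpacePoint(x: Int, y: Int, z: Float,
                      calibration: AVCameraCalibrationData,
                      depthMapWidth: Int) -> SIMD3<Float> {
    let intrinsics = calibration.intrinsicMatrix                               // 3x3 camera matrix
    let referenceWidth = Float(calibration.intrinsicMatrixReferenceDimensions.width)
    let scale = Float(depthMapWidth) / referenceWidth                          // e.g. 640 / full sensor width

    let fx = intrinsics.columns.0.x * scale   // focal length x
    let fy = intrinsics.columns.1.y * scale   // focal length y
    let cx = intrinsics.columns.2.x * scale   // principal point x
    let cy = intrinsics.columns.2.y * scale   // principal point y

    let X = (Float(x) - cx) * z / fx
    let Y = (Float(y) - cy) * z / fy
    return SIMD3<Float>(X, Y, z)
}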

Can't you just do the following?

let convertedDepthData = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat16)
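
converting(toDepthDataType:) is an instance method that returns a new AVDepthData in the requested format. A minimal sketch of how it might be combined with the pixel loop from the accepted answer, converting to Float32 so each value reads directly as a Swift Float (that choice is mine, not part of this answer):

let converted = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
let buffer = converted.depthDataMap

CVPixelBufferLockBaseAddress(buffer, .readOnly)
let width = CVPixelBufferGetWidth(buffer)
let height = CVPixelBufferGetHeight(buffer)
// Assumes rows are tightly packed, as in the accepted answer.
let floatBuffer = CVPixelBufferGetBaseAddress(buffer)!.assumingMemoryBound(to: Float.self)

for y in 0 ..< height {
    for x in 0 ..< width {
        let depth = floatBuffer[y * width + x] // depth value at (x, y)
    }
}
CVPixelBufferUnlockBaseAddress(buffer, .readOnly)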
Jeremy Caney
  • While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. – Yunnosch Sep 18 '22 at 17:24