0

I am building a real-time video processing app. For each frame, I need to get the RGBA values of every pixel, process them with an external library, and display the result. My current approach to reading the pixels is too slow, and I was wondering whether there is a faster way to do it using vImage. This is my current code, which reads all the pixels of the current frame:

    guard let cgImage = context.makeImage() else {
        return nil
    }
    guard let data = cgImage.dataProvider?.data,
          let bytes = CFDataGetBytePtr(data) else {
        fatalError("Couldn't access image data")
    }
    assert(cgImage.colorSpace?.model == .rgb)
    let bytesPerPixel = cgImage.bitsPerPixel / cgImage.bitsPerComponent
    gp.async {
        for y in 0 ..< cgImage.height {
            for x in 0 ..< cgImage.width {
                let offset = (y * cgImage.bytesPerRow) + (x * bytesPerPixel)
                let components = (r: bytes[offset], g: bytes[offset + 1], b: bytes[offset + 2])
                print("[x:\(x), y:\(y)] \(components)")
            }
            print("---")
        }
    }

This is the version using vImage, but it seems to leak memory, and I cannot access the pixels:

    guard
        let format = vImage_CGImageFormat(cgImage: cgImage),
        var buffer = try? vImage_Buffer(cgImage: cgImage,
                                        format: format) else {
            exit(-1)
        }

    let rowStride = buffer.rowBytes / MemoryLayout<Pixel_8>.stride / format.componentCount
    do {
        let componentCount = format.componentCount
        var argbSourcePlanarBuffers: [vImage_Buffer] = (0 ..< componentCount).map { _ in
            guard let buffer1 = try? vImage_Buffer(width: Int(buffer.width),
                                                   height: Int(buffer.height),
                                                   bitsPerPixel: format.bitsPerComponent) else {
                fatalError("Error creating source buffers.")
            }
            return buffer1
        }
        vImageConvert_ARGB8888toPlanar8(&buffer,
                                        &argbSourcePlanarBuffers[0],
                                        &argbSourcePlanarBuffers[1],
                                        &argbSourcePlanarBuffers[2],
                                        &argbSourcePlanarBuffers[3],
                                        vImage_Flags(kvImageNoFlags))

        let n = rowStride * Int(argbSourcePlanarBuffers[1].height) * format.componentCount
        let start = buffer.data.assumingMemoryBound(to: Pixel_8.self)
        var ptr = UnsafeBufferPointer(start: start, count: n)

        print(Array(argbSourcePlanarBuffers)[1]) // prints the first 15 interleaved values
        buffer.free()
    }

  • Your calculation for bytes per pixel is actually calculating components per pixel. Bytes per pixel is bitsPerPixel/8. – Ian Ollmann Dec 05 '21 at 05:46
  • Thank you so much for your response, Ian. I am still struggling with this problem. Can you give an example of the correct way to get the R, G, B, A values of each pixel using vImage? Where exactly in the code is the problem you are mentioning? @IanOllmann – Parham Khamsepour Dec 06 '21 at 04:18

2 Answers

1

You can access the underlying pixels in a vImage buffer to do this.

For example, given an image named cgImage, use the following code to populate a vImage buffer:

guard
    let format = vImage_CGImageFormat(cgImage: cgImage),
    let buffer = try? vImage_Buffer(cgImage: cgImage,
                                    format: format) else {
        exit(-1)
    }

let rowStride = buffer.rowBytes / MemoryLayout<Pixel_8>.stride / format.componentCount

Note that a vImage buffer's data may be wider than the image (see: https://developer.apple.com/documentation/accelerate/finding_the_sharpest_image_in_a_sequence_of_captured_images) which is why I've added rowStride.
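As a rough illustration of why the row stride matters, the sketch below computes the byte offset of a pixel in an interleaved bitmap whose rows are padded. It is plain Swift with no vImage calls; the function name `byteOffset` and the 3 x 2 image with 16-byte rows are assumptions made up for the example. It also uses `bitsPerPixel / 8` for the bytes per pixel.

```swift
/// Byte offset of the pixel at (x, y) in an interleaved bitmap.
/// `rowBytes` may exceed `width * bytesPerPixel` when rows are padded.
func byteOffset(x: Int, y: Int, rowBytes: Int, bitsPerPixel: Int) -> Int {
    let bytesPerPixel = bitsPerPixel / 8
    return y * rowBytes + x * bytesPerPixel
}

// Hypothetical 3 x 2 image, 4 bytes per pixel, rows padded from 12 to 16 bytes.
let width = 3
let height = 2
let rowBytes = 16
var bytes = [UInt8](repeating: 0, count: rowBytes * height)

// Offset of pixel (x: 1, y: 1) when the padding is taken into account.
let offset = byteOffset(x: 1, y: 1, rowBytes: rowBytes, bitsPerPixel: 32)

// Offset you would get by wrongly assuming tightly packed rows.
let naiveOffset = (1 * width + 1) * 4

// Write a recognisable pixel at the padded offset and read it back.
bytes[offset..<offset + 4] = [10, 20, 30, 255]
let pixel = Array(bytes[offset..<offset + 4])
print(offset, naiveOffset, pixel)
```

With a real vImage buffer, `rowBytes` would come from `buffer.rowBytes` rather than a hard-coded value.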

To access the pixels as a single buffer of interleaved values, use:

do {
    let n = rowStride * Int(buffer.height) * format.componentCount
    let start = buffer.data.assumingMemoryBound(to: Pixel_8.self)
    let ptr = UnsafeBufferPointer(start: start, count: n)
    
    print(Array(ptr)[ 0 ... 15]) // prints the first 16 interleaved values
}

To access the pixels as a buffer of Pixel_8888 values, use the following (make sure that format.componentCount is 4):

do {
    let n = rowStride * Int(buffer.height)
    let start = buffer.data.assumingMemoryBound(to: Pixel_8888.self)
    let ptr = UnsafeBufferPointer(start: start, count: n)
    
    print(Array(ptr)[ 0 ... 3]) // prints the first 4 pixels
}
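If separate per-channel buffers are needed, one possible approach is sketched below, assuming an 8-bit, 4-channel image. The names `makePlanarBuffers`, `value(in:x:y:)` and `PlanarSplitError` are hypothetical helpers for this example, not vImage API. `vImageConvert_ARGB8888toPlanar8` splits the channels in memory order, so for a BGRA source the first plane would hold B, then G, R and A. Every vImage buffer is freed once it is no longer needed, which is what avoids the leak.

```swift
import Accelerate
import CoreGraphics

// Hypothetical error type for this sketch; not part of vImage.
enum PlanarSplitError: Error {
    case unsupportedFormat
    case conversionFailed(vImage_Error)
}

/// Splits an 8-bit, 4-channel image into four planar buffers, in memory order.
/// The caller owns the returned buffers and must call `free()` on each one.
func makePlanarBuffers(from cgImage: CGImage) throws -> [vImage_Buffer] {
    guard let format = vImage_CGImageFormat(cgImage: cgImage),
          format.componentCount == 4,
          format.bitsPerComponent == 8 else {
        throw PlanarSplitError.unsupportedFormat
    }

    // The interleaved source is only needed while converting, so free it on exit.
    var source = try vImage_Buffer(cgImage: cgImage, format: format)
    defer { source.free() }

    // Note: a production version would also free earlier planes
    // if a later allocation throws.
    let width = Int(source.width)
    let height = Int(source.height)
    var plane0 = try vImage_Buffer(width: width, height: height, bitsPerPixel: 8)
    var plane1 = try vImage_Buffer(width: width, height: height, bitsPerPixel: 8)
    var plane2 = try vImage_Buffer(width: width, height: height, bitsPerPixel: 8)
    var plane3 = try vImage_Buffer(width: width, height: height, bitsPerPixel: 8)

    let error = vImageConvert_ARGB8888toPlanar8(&source,
                                                &plane0, &plane1, &plane2, &plane3,
                                                vImage_Flags(kvImageNoFlags))
    guard error == kvImageNoError else {
        [plane0, plane1, plane2, plane3].forEach { $0.free() }
        throw PlanarSplitError.conversionFailed(error)
    }
    return [plane0, plane1, plane2, plane3]
}

/// Reads one value from a planar 8-bit buffer, using the plane's own rowBytes
/// because each row may be padded beyond the image width.
func value(in plane: vImage_Buffer, x: Int, y: Int) -> Pixel_8 {
    let base = plane.data.assumingMemoryBound(to: Pixel_8.self)
    return base[y * plane.rowBytes + x]
}
```

A caller would typically write `let planes = try makePlanarBuffers(from: cgImage)`, read values such as `value(in: planes[1], x: 100, y: 100)`, and call `free()` on every plane when finished.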
Flex Monkey
  • Using `let ptr = UnsafeBufferPointer(start: start, count: n)`, can I use vImage to convert the interleaved r,g,b,a data into planar form, with the r, g, b and a values each in their own buffer? – Parham Khamsepour Nov 18 '21 at 14:22
  • Yep. `vImageConvert_ARGB8888toPlanar8` will populate four planar buffers from an interleaved 4-channel buffer. – Flex Monkey Nov 18 '21 at 14:48
  • Based on the answer you gave, I am using `let buffer = try? vImage_Buffer(cgImage: cgImage, format: format)` and passing it to `vImageConvert_ARGB8888toPlanar8`. How can I access the buffers produced by `vImageConvert_ARGB8888toPlanar8`? For example, if I want to access the pixel at (100, 100) in the R channel. Also, this seems to have a memory leak. Is there a way to avoid it? – Parham Khamsepour Nov 18 '21 at 15:19
  • Once you've created and populated the RGBA buffer, you need to create four planar 8-bit buffers that you pass as the destinations to `vImageConvert_ARGB8888toPlanar8`. For each of those planar buffers, use the `buffer.data.assumingMemoryBound(to: Pixel_8.self)` example - each of those buffers contains the pixel values for their corresponding color channel. Whenever you're finished working with a vImage buffer, you need to deallocate its memory. – Flex Monkey Nov 18 '21 at 15:25
  • Thank you. I edited my original post and added the vImage part to it. I can't find what I am doing wrong. Can you please take a look at the code using vImage, and help me fix the memory leak and access each RGBA buffer? – Parham Khamsepour Nov 18 '21 at 15:34
  • I am totally new to this field, and this is my first time. I would really appreciate it if you could help me fix this code. – Parham Khamsepour Nov 18 '21 at 15:35
  • That looks pretty much spot on. The second to last line should be `print(Array(ptr)[1])` - that will print pixel `1` of `argbSourcePlanarBuffers[1]`. You also need to free all the planar buffers, not just the ARGB one. – Flex Monkey Nov 18 '21 at 15:42
  • Thank you. So `print(Array(ptr)[1])` will print the pixel at index 1 from `argbSourcePlanarBuffers[1]`? How can I get, say, pixel (100,100) from `argbSourcePlanarBuffers[1]`? Also, is it right to have `let start = buffer.data.assumingMemoryBound(to: Pixel_8.self)`, or should I have `let start = argbSourcePlanarBuffers[1].data.assumingMemoryBound(to: Pixel_8.self)`? – Parham Khamsepour Nov 18 '21 at 15:47
  • If you want to see the pixels in one of the planar buffers, you need `let start = argbSourcePlanarBuffers[1].data.assumingMemoryBound(to: Pixel_8.self)` – Flex Monkey Nov 18 '21 at 15:49
  • Doing this gives me an error: `Swift/UnsafePointer.swift:832: Fatal error: UnsafeMutablePointer.initialize overlapping range` – Parham Khamsepour Nov 18 '21 at 15:52
  • So even if I can't print them, after calling `vImageConvert_ARGB8888toPlanar8` each of the 4 `argbSourcePlanarBuffers` contains the A, R, G, B values, and I can pass them to my library? – Parham Khamsepour Nov 18 '21 at 15:58
  • ` let start = argbSourcePlanarBuffers[1].data.assumingMemoryBound(to: Pixel_8.self)` works fine for me. You have pointers (or arrays if you want them) of the pixels for each channel. You can do whatever you like with them – Flex Monkey Nov 18 '21 at 16:16
  • Thank you so much for the help. My problem was that I didn't change the `rowStride` value. Also, does `print(Array(ptr)[307199])` print the bottom-right pixel of a 480*640 image? That is, is (0,0) the top left and index 307199 the bottom right, or is it counted in a different way? – Parham Khamsepour Nov 18 '21 at 16:25
  • Sorry to bother you again. What is the order of `argbSourcePlanarBuffers`? Is `argbSourcePlanarBuffers[0]` alpha, `argbSourcePlanarBuffers[1]` red, `argbSourcePlanarBuffers[2]` green and `argbSourcePlanarBuffers[3]` blue? I ask because the values I am getting for each pixel, based on this order, do not match the correct color. – Parham Khamsepour Nov 18 '21 at 22:24
  • You need to look at the `CGImageAlphaInfo` in `vImage_CGImageFormat.bitmapInfo` to confirm the order. – Flex Monkey Nov 19 '21 at 08:39
  • I am using `kCVPixelFormatType_32BGRA`, and my bitmap info is `var bitmapInfo: UInt32 = CGBitmapInfo.byteOrder32Little.rawValue; bitmapInfo |= CGImageAlphaInfo.premultipliedFirst.rawValue & CGBitmapInfo.alphaInfoMask.rawValue`. Does this mean that `argbSourcePlanarBuffers[0]` is B, `argbSourcePlanarBuffers[1]` is G, `argbSourcePlanarBuffers[2]` is R and `argbSourcePlanarBuffers[3]` is A? Is this correct? – Parham Khamsepour Nov 19 '21 at 16:14
0

This is the slowest way to do it. A faster way is with a custom CoreImage filter.

Faster than that is to write your own OpenGL shader (or rather, its equivalent in Metal for current devices).

I've written OpenGL shaders, but have not worked with Metal yet.

Both allow you to write graphics code that runs directly on the GPU.

Duncan C
  • Thank you. As I have no prior knowledge of OpenGL or Metal, I was wondering if you have a code snippet or a tutorial that could help me get the RGBA value of each pixel of a UIImage or CGImage. – Parham Khamsepour Nov 17 '21 at 22:17
  • I don’t. The OpenGL work I’ve done has been on macOS, and iOS used OpenGL ES. The shader language is different. I haven’t done any work with Metal. New iOS devices use Metal, so you should use that, not OpenGL. – Duncan C Nov 18 '21 at 00:52
  • Is it possible to just get the matrix of each pixel's r,g,b,a values using Metal, and pass it to Swift? I have an external C++ library that I'm going to give these r,g,b matrices to, and get some result back. Is using Metal the best option just to extract this data for each frame? Is it even possible? – Parham Khamsepour Nov 18 '21 at 01:06
  • I need the R, G, B, A values each as a matrix (2D array or buffer) that I can pass to my library, and each image is a frame from the camera that needs to be processed fast. What is the best way to achieve this? I hope it is clearer now. – Parham Khamsepour Nov 18 '21 at 01:22
  • While using the GPU is indeed speedy, you will despair of ever getting a small answer for any Metal programming problem. Between setting up devices and command buffers, JIT-compiling the code, setting up MTLResources and forgetting to synchronize the results back to the CPU, you are signing up for several pages of code, considerable head scratching and a fair bit of work. That said, assuming the camera image is coming back in an IOSurface or CVPixelBuffer, there are few better ways to process it fast. – Ian Ollmann Dec 06 '22 at 09:09
  • You might take a look at MetalPerformanceShaders which should automate some of this process, provided it does what you want to do. – Ian Ollmann Dec 06 '22 at 09:10
  • You said “Because I need the R,G,B,A values each as a matrix(2d array or buffer) that i can pass to my library.” If you want to use Metal to do the work on the GPU, you want to do EVERYTHING on the GPU. You don’t want to pass data back and forth between the GPU and CPU. Thus you’d want to rewrite your library in Metal. Otherwise you are likely better off doing everything on the CPU, since passing data between the GPU and CPU slows things down dramatically. – Duncan C Dec 06 '22 at 11:22