
I'm using a media capture library called NextLevel that produces a CMSampleBuffer on each frame. I want to take this buffer, feed it into GPUImage2 through a RawDataInput, run it through some filters, and read it back from a RawDataOutput at the end of the chain:

CMSampleBuffer bytes -> rawDataInput -> someFilter -> someotherFilter -> rawDataOutput -> make a CVPixelBuffer for other stuff.
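
Roughly, the wiring I have in mind in GPUImage2 code is the following (just a sketch: SaturationAdjustment and Pixellate are placeholders for someFilter/someotherFilter, and the callback body is illustrative):

    import GPUImage

    let rawInput = RawDataInput()
    let someFilter = SaturationAdjustment()   // placeholder filter
    let someOtherFilter = Pixellate()         // placeholder filter
    let rawOutput = RawDataOutput()

    rawOutput.dataAvailableCallback = { bytes in
        // `bytes` holds the filtered RGBA data; build the CVPixelBuffer from it here.
    }

    rawInput --> someFilter --> someOtherFilter --> rawOutput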

The problem is how to convert a CMSampleBuffer into an array of UInt8 so that RawDataInput can accept it.

I have the following code, but it's insanely slow: the frame travels all the way through the chain to rawDataOutput.dataAvailableCallback, but at roughly 1 frame per second. I found this code online; I have no idea what it is doing mathematically, but I suspect it is inefficient.

    let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
    CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

    // Bi-planar YpCbCr: plane 0 is luma (Y), plane 1 is interleaved chroma (CbCr).
    let lumaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
    let chromaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1)

    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)

    let lumaBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
    let chromaBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1)
    let lumaBuffer = lumaBaseAddress?.assumingMemoryBound(to: UInt8.self)
    let chromaBuffer = chromaBaseAddress?.assumingMemoryBound(to: UInt8.self)

    var rgbaImage = [UInt8](repeating: 0, count: 4 * width * height)
    for x in 0 ..< width {
        for y in 0 ..< height {
            // The chroma plane is subsampled 2x2, hence the /2 indexing; CbCr pairs are interleaved.
            let lumaIndex = x + y * lumaBytesPerRow
            let chromaIndex = (y / 2) * chromaBytesPerRow + (x / 2) * 2
            let yp = lumaBuffer?[lumaIndex]
            let cb = chromaBuffer?[chromaIndex]
            let cr = chromaBuffer?[chromaIndex + 1]

            // BT.601 YCbCr -> RGB, computed per pixel in Double.
            let ri = Double(yp!)                                 + 1.402   * (Double(cr!) - 128)
            let gi = Double(yp!) - 0.34414 * (Double(cb!) - 128) - 0.71414 * (Double(cr!) - 128)
            let bi = Double(yp!) + 1.772   * (Double(cb!) - 128)

            let r = UInt8(min(max(ri, 0), 255))
            let g = UInt8(min(max(gi, 0), 255))
            let b = UInt8(min(max(bi, 0), 255))

            // Note: the bytes are written in B, G, R, A order even though the array
            // is uploaded with PixelFormat.rgba below.
            rgbaImage[(x + y * width) * 4]     = b
            rgbaImage[(x + y * width) * 4 + 1] = g
            rgbaImage[(x + y * width) * 4 + 2] = r
            rgbaImage[(x + y * width) * 4 + 3] = 255
        }
    }

    self.rawInput.uploadBytes(rgbaImage, size: Size(width: Float(width), height: Float(height)), pixelFormat: PixelFormat.rgba)
    CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
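
For comparison, the same BT.601 conversion can be expressed with Accelerate's vImage converters instead of a per-pixel Swift loop. This is only a sketch: it assumes an 8-bit bi-planar full-range buffer (kCVPixelFormatType_420YpCbCr8BiPlanarFullRange), and the pixel-range values and matrix constant would need adjusting for video-range input.

    import Accelerate
    import CoreVideo

    // Sketch: convert the bi-planar YpCbCr frame to RGBA with vImage instead of
    // a per-pixel Swift loop.
    func rgbaBytes(from pixelBuffer: CVPixelBuffer) -> [UInt8] {
        CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
        defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

        let width = CVPixelBufferGetWidth(pixelBuffer)
        let height = CVPixelBufferGetHeight(pixelBuffer)

        // Wrap the two source planes in vImage buffers.
        var srcYp = vImage_Buffer(data: CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0),
                                  height: vImagePixelCount(height),
                                  width: vImagePixelCount(width),
                                  rowBytes: CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0))
        var srcCbCr = vImage_Buffer(data: CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1),
                                    height: vImagePixelCount(height / 2),
                                    width: vImagePixelCount(width / 2),
                                    rowBytes: CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1))

        // Full-range 8-bit pixel range (values taken from the vImage header comments).
        var pixelRange = vImage_YpCbCrPixelRange(Yp_bias: 0, CbCr_bias: 128,
                                                 YpRangeMax: 255, CbCrRangeMax: 255,
                                                 YpMax: 255, YpMin: 1,
                                                 CbCrMax: 255, CbCrMin: 0)
        var conversionInfo = vImage_YpCbCrToARGB()
        _ = vImageConvert_YpCbCrToARGB_GenerateConversion(kvImage_YpCbCrToARGBMatrix_ITU_R_601_4!,
                                                          &pixelRange,
                                                          &conversionInfo,
                                                          kvImage420Yp8_CbCr8,
                                                          kvImageARGB8888,
                                                          vImage_Flags(kvImageNoFlags))

        // One vectorised call converts the whole frame; the permute map reorders
        // the converter's ARGB output into RGBA so it matches PixelFormat.rgba.
        var rgba = [UInt8](repeating: 0, count: 4 * width * height)
        rgba.withUnsafeMutableBytes { destBytes in
            var dest = vImage_Buffer(data: destBytes.baseAddress,
                                     height: vImagePixelCount(height),
                                     width: vImagePixelCount(width),
                                     rowBytes: 4 * width)
            let permuteMap: [UInt8] = [1, 2, 3, 0]   // ARGB -> RGBA
            _ = vImageConvert_420Yp8_CbCr8ToARGB8888(&srcYp, &srcCbCr, &dest,
                                                     &conversionInfo, permuteMap, 255,
                                                     vImage_Flags(kvImageNoFlags))
        }
        return rgba
    }

Even as a rough sketch, this keeps the conversion in a single vectorised call instead of width × height loop iterations.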

Update 1

I'm using a camera library called NextLevel to retrieve the camera frames (CMSampleBuffer) and feed them to the filter chain, in this case RawDataInput, via an array of UInt8 bytes. Because NextLevel uses luma/chroma when possible, I commented out the 5 lines at https://github.com/NextLevel/NextLevel/blob/master/Sources/NextLevel.swift#L1106 as @Rhythmic Fistman suggested. But the code above would then break, so I replaced it with the following.

    let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
    CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)

    // The buffer is now interleaved BGRA, so copy it into the array byte by byte.
    let int8Buffer = baseAddress?.assumingMemoryBound(to: UInt8.self)
    var rgbaImage = [UInt8](repeating: 0, count: 4 * width * height)
    for i in 0 ..< (width * height * 4) {
        rgbaImage[i] = int8Buffer![i]
    }

    self.rawInput.uploadBytes(rgbaImage, size: Size(width: Float(width), height: Float(height)), pixelFormat: PixelFormat.rgba)

    CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

This code works when NextLevel is not using luma/chroma, but the frames are still very slow when displayed at the end of the filter chain with a GPUImage2 RenderView. A row-wise copy (sketched below) avoids the per-byte loop.
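
For what it's worth, here is that sketch. It assumes the buffer is interleaved 32-bit BGRA/RGBA, and it accounts for possible row padding; the function name is just illustrative.

    import CoreVideo
    import Darwin   // memcpy

    // Sketch: copy whole rows instead of one byte per loop iteration, and respect
    // bytesPerRow, which can be larger than width * 4 because of row padding.
    func interleavedBytes(from pixelBuffer: CVPixelBuffer) -> [UInt8] {
        CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
        defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

        let width = CVPixelBufferGetWidth(pixelBuffer)
        let height = CVPixelBufferGetHeight(pixelBuffer)
        let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
        guard let src = CVPixelBufferGetBaseAddress(pixelBuffer) else { return [] }

        var image = [UInt8](repeating: 0, count: 4 * width * height)
        image.withUnsafeMutableBytes { dest in
            if bytesPerRow == width * 4 {
                // No padding: one bulk copy of the whole frame.
                memcpy(dest.baseAddress, src, height * bytesPerRow)
            } else {
                // Padded rows: copy only the visible part of each row.
                for row in 0 ..< height {
                    memcpy(dest.baseAddress! + row * width * 4,
                           src + row * bytesPerRow,
                           width * 4)
                }
            }
        }
        return image
    }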

Update 2

So I decided to make a custom RawDataInput.swift based on the Camera.swift from GPUImage2. Since the Camera class takes frames from the native camera in CMSampleBuffer format, and NextLevel delivers exactly the same kind of buffers, I figured I could copy the implementation of GPUImage2's Camera class, remove everything I don't need, and leave a single method that receives a CMSampleBuffer and processes it. It turns out this works perfectly, EXCEPT that there is a lag (no dropped frames, just lag). I don't know where the bottleneck is. I have read that processing/modifying the CMSampleBuffers coming out of the native camera and then displaying them can cause delays, as mentioned in this question: How to keep low latency during the preview of video coming from AVFoundation?

I made a video of the lag I'm experiencing: https://www.youtube.com/watch?v=5DQRnOTi4wk

The top-corner preview comes from NextLevel's previewLayer: AVCaptureVideoPreviewLayer, and the filtered preview is a GPUImage2 RenderView at the end of the chain, running on an iPhone 6 at 1920px resolution with 7 filters. This lag doesn't happen with the GPUImage2 Camera class.

Here is the custom RawDataInput I put together.

#if os(Linux)
#if GLES
    import COpenGLES.gles2
    #else
    import COpenGL
#endif
#else
#if GLES
    import OpenGLES
    #else
    import OpenGL.GL3
#endif
#endif

import AVFoundation

public enum PixelFormat {
    case bgra
    case rgba
    case rgb
    case luminance

    func toGL() -> Int32 {
        switch self {
            case .bgra: return GL_BGRA
            case .rgba: return GL_RGBA
            case .rgb: return GL_RGB
            case .luminance: return GL_LUMINANCE
        }
    }
}

// TODO: Replace with texture caches where appropriate
public class RawDataInput: ImageSource {
    public let targets = TargetContainer()

    let frameRenderingSemaphore = DispatchSemaphore(value:1)
    let cameraProcessingQueue = DispatchQueue.global(priority:DispatchQueue.GlobalQueuePriority.default)
    let captureAsYUV:Bool = true
    let yuvConversionShader:ShaderProgram?
    var supportsFullYUVRange:Bool = false

    public init() {
        if captureAsYUV {
            supportsFullYUVRange = false
            let videoOutput = AVCaptureVideoDataOutput()
            let supportedPixelFormats = videoOutput.availableVideoCVPixelFormatTypes
            for currentPixelFormat in supportedPixelFormats! {
                if ((currentPixelFormat as! NSNumber).int32Value == Int32(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)) {
                    supportsFullYUVRange = true
                }
            }

            if (supportsFullYUVRange) {
                yuvConversionShader = crashOnShaderCompileFailure("Camera"){try sharedImageProcessingContext.programForVertexShader(defaultVertexShaderForInputs(2), fragmentShader:YUVConversionFullRangeFragmentShader)}
            } else {
                yuvConversionShader = crashOnShaderCompileFailure("Camera"){try sharedImageProcessingContext.programForVertexShader(defaultVertexShaderForInputs(2), fragmentShader:YUVConversionVideoRangeFragmentShader)}
            }
        } else {
            yuvConversionShader = nil
        }

    }

    public func uploadPixelBuffer(_ cameraFrame: CVPixelBuffer ) {
        guard (frameRenderingSemaphore.wait(timeout:DispatchTime.now()) == DispatchTimeoutResult.success) else { return }

        let bufferWidth = CVPixelBufferGetWidth(cameraFrame)
        let bufferHeight = CVPixelBufferGetHeight(cameraFrame)

        CVPixelBufferLockBaseAddress(cameraFrame, CVPixelBufferLockFlags(rawValue:CVOptionFlags(0)))

        sharedImageProcessingContext.runOperationAsynchronously{
            let cameraFramebuffer:Framebuffer
            let luminanceFramebuffer:Framebuffer
            let chrominanceFramebuffer:Framebuffer
            if sharedImageProcessingContext.supportsTextureCaches() {
                var luminanceTextureRef:CVOpenGLESTexture? = nil
                let _ = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, sharedImageProcessingContext.coreVideoTextureCache, cameraFrame, nil, GLenum(GL_TEXTURE_2D), GL_LUMINANCE, GLsizei(bufferWidth), GLsizei(bufferHeight), GLenum(GL_LUMINANCE), GLenum(GL_UNSIGNED_BYTE), 0, &luminanceTextureRef)
                let luminanceTexture = CVOpenGLESTextureGetName(luminanceTextureRef!)
                glActiveTexture(GLenum(GL_TEXTURE4))
                glBindTexture(GLenum(GL_TEXTURE_2D), luminanceTexture)
                glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_S), GL_CLAMP_TO_EDGE)
                glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_T), GL_CLAMP_TO_EDGE)
                luminanceFramebuffer = try! Framebuffer(context:sharedImageProcessingContext, orientation:.portrait, size:GLSize(width:GLint(bufferWidth), height:GLint(bufferHeight)), textureOnly:true, overriddenTexture:luminanceTexture)

                var chrominanceTextureRef:CVOpenGLESTexture? = nil
                let _ = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, sharedImageProcessingContext.coreVideoTextureCache, cameraFrame, nil, GLenum(GL_TEXTURE_2D), GL_LUMINANCE_ALPHA, GLsizei(bufferWidth / 2), GLsizei(bufferHeight / 2), GLenum(GL_LUMINANCE_ALPHA), GLenum(GL_UNSIGNED_BYTE), 1, &chrominanceTextureRef)
                let chrominanceTexture = CVOpenGLESTextureGetName(chrominanceTextureRef!)
                glActiveTexture(GLenum(GL_TEXTURE5))
                glBindTexture(GLenum(GL_TEXTURE_2D), chrominanceTexture)
                glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_S), GL_CLAMP_TO_EDGE)
                glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_T), GL_CLAMP_TO_EDGE)
                chrominanceFramebuffer = try! Framebuffer(context:sharedImageProcessingContext, orientation:.portrait, size:GLSize(width:GLint(bufferWidth / 2), height:GLint(bufferHeight / 2)), textureOnly:true, overriddenTexture:chrominanceTexture)
            } else {
                glActiveTexture(GLenum(GL_TEXTURE4))
                luminanceFramebuffer = sharedImageProcessingContext.framebufferCache.requestFramebufferWithProperties(orientation:.portrait, size:GLSize(width:GLint(bufferWidth), height:GLint(bufferHeight)), textureOnly:true)
                luminanceFramebuffer.lock()

                glBindTexture(GLenum(GL_TEXTURE_2D), luminanceFramebuffer.texture)
                glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_LUMINANCE, GLsizei(bufferWidth), GLsizei(bufferHeight), 0, GLenum(GL_LUMINANCE), GLenum(GL_UNSIGNED_BYTE), CVPixelBufferGetBaseAddressOfPlane(cameraFrame, 0))

                glActiveTexture(GLenum(GL_TEXTURE5))
                chrominanceFramebuffer = sharedImageProcessingContext.framebufferCache.requestFramebufferWithProperties(orientation:.portrait, size:GLSize(width:GLint(bufferWidth / 2), height:GLint(bufferHeight / 2)), textureOnly:true)
                chrominanceFramebuffer.lock()
                glBindTexture(GLenum(GL_TEXTURE_2D), chrominanceFramebuffer.texture)
                glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_LUMINANCE_ALPHA, GLsizei(bufferWidth / 2), GLsizei(bufferHeight / 2), 0, GLenum(GL_LUMINANCE_ALPHA), GLenum(GL_UNSIGNED_BYTE), CVPixelBufferGetBaseAddressOfPlane(cameraFrame, 1))
            }

            cameraFramebuffer = sharedImageProcessingContext.framebufferCache.requestFramebufferWithProperties(orientation:.portrait, size:luminanceFramebuffer.sizeForTargetOrientation(.portrait), textureOnly:false)

            let conversionMatrix:Matrix3x3
            if (self.supportsFullYUVRange) {
                conversionMatrix = colorConversionMatrix601FullRangeDefault
            } else {
                conversionMatrix = colorConversionMatrix601Default
            }
            convertYUVToRGB(shader:self.yuvConversionShader!, luminanceFramebuffer:luminanceFramebuffer, chrominanceFramebuffer:chrominanceFramebuffer, resultFramebuffer:cameraFramebuffer, colorConversionMatrix:conversionMatrix)


            // Alternative RGBA-only upload path (no YUV conversion), kept for reference:
            //let cameraFramebuffer:Framebuffer = sharedImageProcessingContext.framebufferCache.requestFramebufferWithProperties(orientation:.portrait, size:GLSize(width:GLint(bufferWidth), height:GLint(bufferHeight)), textureOnly:true)
            //glBindTexture(GLenum(GL_TEXTURE_2D), cameraFramebuffer.texture)
            //glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_RGBA, GLsizei(bufferWidth), GLsizei(bufferHeight), 0, GLenum(GL_BGRA), GLenum(GL_UNSIGNED_BYTE), CVPixelBufferGetBaseAddress(cameraFrame))

            CVPixelBufferUnlockBaseAddress(cameraFrame, CVPixelBufferLockFlags(rawValue:CVOptionFlags(0)))


            self.updateTargetsWithFramebuffer(cameraFramebuffer)
            self.frameRenderingSemaphore.signal()

        }
    }

    public func uploadBytes(_ bytes:[UInt8], size:Size, pixelFormat:PixelFormat, orientation:ImageOrientation = .portrait) {
        let dataFramebuffer = sharedImageProcessingContext.framebufferCache.requestFramebufferWithProperties(orientation:orientation, size:GLSize(size), textureOnly:true, internalFormat:pixelFormat.toGL(), format:pixelFormat.toGL())

        glActiveTexture(GLenum(GL_TEXTURE1))
        glBindTexture(GLenum(GL_TEXTURE_2D), dataFramebuffer.texture)
        glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_RGBA, size.glWidth(), size.glHeight(), 0, GLenum(pixelFormat.toGL()), GLenum(GL_UNSIGNED_BYTE), bytes)

        updateTargetsWithFramebuffer(dataFramebuffer)
    }

    public func transmitPreviousImage(to target:ImageConsumer, atIndex:UInt) {
        // TODO: Determine if this is necessary for the raw data uploads
//        if let buff = self.dataFramebuffer {
//            buff.lock()
//            target.newFramebufferAvailable(buff, fromSourceIndex:atIndex)
//        }
    }
}
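
And this is roughly how I feed it (a sketch only: `rawInput` is an instance of the class above, and `process(_:)` is a hypothetical helper called from NextLevel's video delegate callback with each incoming CMSampleBuffer).

    import CoreMedia

    let rawInput = RawDataInput()

    func process(_ sampleBuffer: CMSampleBuffer) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        rawInput.uploadPixelBuffer(pixelBuffer)   // hands the YpCbCr planes to the GL path above
    }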

I just don't understand why there is that lag if this is no different from the GPUImage2 Camera class. NextLevel is not doing any other processing on those frames, it is just passing them along, so why the delay?

omarojo
  • can't GPUImage2 work with luma/chroma? If not you could try working with RGBA at the start of the pipeline (you didn't say where each frame was coming from - the camera?) or if you must convert to RGBA you could do it on the GPU. – Rhythmic Fistman Mar 21 '17 at 12:14
  • It comes from a third-party camera capture library. So yes, I suppose it's a CMSampleBuffer coming from the camera. – omarojo Mar 21 '17 at 21:58
  • Can you configure the camera capture library to use RGBA? If not, can you discard the library? It's not that hard getting samples from the camera, unless the library does something pretty special it may not be adding much value to your project. – Rhythmic Fistman Mar 21 '17 at 22:01
  • This is the library I'm using: https://github.com/NextLevel/NextLevel. It does many things that GPUImage2's Camera class doesn't do, things as simple as landscape pictures/videos, which I don't see any support for in GPUImage2. – omarojo Mar 21 '17 at 22:37
  • It looks like NextLevel will always prefer luma/chroma when possible. You could change this and see if your performance improves by commenting out the 5 lines starting here: https://github.com/NextLevel/NextLevel/blob/master/Sources/NextLevel.swift#L1106 – Rhythmic Fistman Mar 22 '17 at 00:38
  • If I do that, it always breaks on the line 'let ri = Double(yp!)' from the above code: 'unexpectedly found nil while unwrapping an Optional value'. – omarojo Mar 22 '17 at 02:27
  • You don't need to do your conversion in that case - `pixelBuffer` will already be RGBA and you can pass it straight to GPUImage2 – Rhythmic Fistman Mar 22 '17 at 13:45
  • but hooow :) how do I pass a CMSampleBuffer to GPUImage2 ? What Input class should I use? rawDataInput requires an array of UInt8, I updated my question. – omarojo Mar 22 '17 at 20:34
  • You need to pass `baseAddress`, which is an exercise in swift casting rules. You shouldn't need to copy the bytes. – Rhythmic Fistman Mar 22 '17 at 20:36
  • Maybe raise a (question) issue on the GPUImage2 github page. GPUImage2 does understand luma/chroma, a higher level view would definitely help here. – Rhythmic Fistman Mar 22 '17 at 23:31
  • @RhythmicFistman I updated the question, I have it working now with chroma/luma. RawDataInput was of no help, so I created my own input class based on the Camera class that GPUImage2 offers. Way faster than before, but still a delay. :( UI interaction is not blocked anymore, but the lag is something I don't understand. – omarojo Mar 28 '17 at 23:40
  • can you create a small project that reproduces this? – Rhythmic Fistman Mar 29 '17 at 01:03
  • @RhythmicFistman here is the sample; the Filterchain.swift class has everything, and the custom RawDataInput is in the GPUImage project. The delegate method from NextLevel defined in the FilterChain class has the call to the RawDataInput feed-in. Tapping the screen randomizes the filters. https://github.com/omarojo/PublicRandomizer/tree/develop – omarojo Mar 29 '17 at 17:05
  • @RhythmicFistman I can't tell from your example if you are using a dispatch serial queue to handle all your processing. That could certainly cause the lag. I think that is what GPUImage2 camera does with `runOperationAsynchronously`. – chourobin Apr 13 '17 at 20:24

1 Answer


I was facing the same issue and spent a lot of time trying to resolve it. I finally found the solution: the video frame lag is related to video stabilization. Just use this line:

NextLevel.shared.videoStabilizationMode = .off

Its default value is .auto, which is why the issue occurs.
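
For anyone wondering where to set it: something like this, done before starting capture, works (a sketch only; the configureCapture() wrapper and error handling are illustrative, and the start-up call may differ depending on your NextLevel version and setup).

    import NextLevel

    func configureCapture() {
        let nextLevel = NextLevel.shared
        nextLevel.videoStabilizationMode = .off   // default is .auto, which causes the frame lag

        do {
            try nextLevel.start()                 // start the session only after configuring it
        } catch {
            print("NextLevel failed to start: \(error)")
        }
    }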

Hashim Khan
  • Wow, how crazy eh. It's been a while now. We created a new app from scratch using the MetalPetal library and coded our shaders in Metal. GPUImage is a thing of the past now. Check out our app GenerateApp.com – omarojo Sep 14 '21 at 16:11
  • Great to hear that you have created an app from scratch. Congratulations! I replied because I faced the same issue and thought it might be helpful for someone facing the same problem. – Hashim Khan Sep 15 '21 at 14:12