
The documentation on this library is essentially non-existent, so I really need your help here.

Goal: I need H264 encoding (preferably with both audio and video, but video alone is fine and I'll play around for a few days to get audio working too) so I can pass it into an MPEG transport stream.

What I have: I have a camera that records and outputs sample buffers. Inputs are the back camera and the built-in mic.

A few questions: A. Is it possible to get the camera to output CMSampleBuffers in H264 format? I mean, the 2014 WWDC material shows them being produced by VTCompressionSessions, but while writing my captureOutput I see that I already get a CMSampleBuffer... B. How do I set up a VTCompressionSession? How is the session used? Some overarching top-level discussion about this might help people understand what's actually going on in this barely documented library.

Code here (please ask for more if you need it; I'm only putting captureOutput because I don't know how relevant the rest of the code is):

func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {
    println(CMSampleBufferGetFormatDescription(sampleBuffer))
    if let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
        let pixelBuffer = imageBuffer as CVPixelBufferRef
        let timeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        // Do some VTCompressionSession stuff
    }
}

Thanks all!

dcheng

1 Answer


First, initialise the VTCompressionSession and set its properties:

    // Source pixel-buffer attributes (empty here; add keys such as
    // kCVPixelBufferPixelFormatTypeKey if you need a specific format).
    NSDictionary* bAttributes = @{};

    // The hardware-acceleration keys are encoder *specification* keys,
    // so they belong in the creation call, not in the session properties.
    NSDictionary* encoderSpec = @{
        (id)kVTVideoEncoderSpecification_EnableHardwareAcceleratedVideoEncoder: @(YES),
        (id)kVTVideoEncoderSpecification_RequireHardwareAcceleratedVideoEncoder: @(YES)
    };

    VTCompressionSessionRef vtComp;
    OSStatus result = VTCompressionSessionCreate(NULL,
        (int32_t)trackSize.width,
        (int32_t)trackSize.height,
        kCMVideoCodecType_H264,
        (__bridge CFDictionaryRef)encoderSpec,
        (__bridge CFDictionaryRef)bAttributes,
        NULL,                 // compressed-data allocator
        compressCallback,     // called with each encoded frame
        NULL,                 // outputCallbackRefCon
        &vtComp);
    NSLog(@"create VTCS status: %d", (int)result);

    NSDictionary* compProperties = @{
        (id)kVTCompressionPropertyKey_ProfileLevel: (id)kVTProfileLevel_H264_High_AutoLevel,
        (id)kVTCompressionPropertyKey_H264EntropyMode: (id)kVTH264EntropyMode_CABAC,
        (id)kVTCompressionPropertyKey_Quality: @(0.95),
        (id)kVTCompressionPropertyKey_RealTime: @(YES)
    };

    result = VTSessionSetProperties(vtComp, (__bridge CFDictionaryRef)compProperties);

The compressCallback is your function that is called whenever compressed data is available. It looks something like this:

void compressCallback(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer)
{
    // Skip frames that were dropped or failed to encode.
    if (status != noErr || sampleBuffer == NULL) return;

    // sourceFrameRefCon is the per-frame context passed to VTCompressionSessionEncodeFrame.
    AVAssetWriterInput* aw = (__bridge AVAssetWriterInput*)sourceFrameRefCon;
    [aw appendSampleBuffer:sampleBuffer];
}
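Since the question's end goal is an MPEG transport stream: the sample buffers delivered to this callback are in AVCC format (length-prefixed NAL units, with the SPS/PPS stored in the format description), while a TS muxer typically expects Annex-B NAL units with start codes. A sketch of pulling the parameter sets out of a callback buffer, using CMVideoFormatDescriptionGetH264ParameterSetAtIndex (the function name and helper below are illustrative, not part of the code above):

    // Hypothetical helper: copy SPS/PPS so they can be written as
    // Annex-B NAL units (0x00000001 start code) ahead of keyframes.
    static void copyParameterSets(CMSampleBufferRef sampleBuffer)
    {
        CMFormatDescriptionRef desc = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t parameterSetCount = 0;
        int nalHeaderLength = 0;

        // First call only asks how many parameter sets there are (SPS, PPS, ...).
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(desc, 0, NULL, NULL,
            &parameterSetCount, &nalHeaderLength);

        for (size_t i = 0; i < parameterSetCount; i++) {
            const uint8_t *paramSet = NULL;
            size_t paramSetSize = 0;
            CMVideoFormatDescriptionGetH264ParameterSetAtIndex(desc, i,
                &paramSet, &paramSetSize, NULL, NULL);
            // Prefix paramSet with a 4-byte start code and hand it to your TS muxer.
        }
    }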

Then you have your read/compress loop. You obtain a CVImageBuffer from your CMSampleBuffer and pass it to the compressor:

        CVPixelBufferRef buffer = CMSampleBufferGetImageBuffer(cmbuf);
        CMTime currentTime = CMSampleBufferGetPresentationTimeStamp(cmbuf);
        CMTime frameDuration = CMSampleBufferGetDuration(cmbuf);
        VTEncodeInfoFlags encodeResult;

        result = VTCompressionSessionEncodeFrame(vtComp,
            buffer,
            currentTime,                  // presentation timestamp
            frameDuration,
            NULL,                         // frameProperties
            (__bridge void *)writerInput, // per-frame context handed to the callback
            &encodeResult);

Obviously you'll need to check the status and return values, but this should get you looking in the right direction.
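One more step worth mentioning: when capture stops, the session should be flushed so any frames still in flight reach the callback, then torn down. A minimal sketch, assuming the vtComp session created above:

    // Flush all pending frames (kCMTimeIndefinite = complete everything),
    // then invalidate and release the session.
    VTCompressionSessionCompleteFrames(vtComp, kCMTimeIndefinite);
    VTCompressionSessionInvalidate(vtComp);
    CFRelease(vtComp);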

silicontrip