
I am running into some problems trying to create a ProRes-encoded .mov file using the AVFoundation framework and AVAssetWriter.

I am on OS X 10.10.5, using Xcode 7 and linking against the 10.9 libraries. So far I have managed to create valid ProRes files that contain both video and multiple channels of audio.

(I am creating multiple tracks of uncompressed 48 kHz, 16-bit PCM audio.)

Adding the video frames works well, and adding the audio frames also works, or at least succeeds in the code.

However, when I play the file back, it appears as though the audio frames are repeated in sequences of 12, 13, 14, or 15 frames.

Looking at the waveform from the *.mov, it is easy to see the repeated audio.

That is to say, the first 13 (or X) video frames all contain exactly the same audio; this is then repeated for the next X frames, and so on for the rest of the file.

The video is fine; it is only the audio that appears to be looping/repeating.

The issue appears no matter how many audio channels/tracks I use as the source. I have tested using just 1 track and also using 4 and 8 tracks.

It is also independent of the video format and the number of samples I feed to the system: 720p60, 1080p23, and 1080i59 all exhibit the same incorrect behavior.

  • Well, actually the 720p captures appear to repeat the audio frames 30 or 31 times, whereas the 1080 formats repeat them only 12 or 13 times.
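
For reference, the number of 48 kHz audio samples that accompanies each video frame differs between these formats. The following is a minimal sketch of that arithmetic only. It assumes that "1080p23" means 24000/1001 fps and that the 59/60 formats run at 60000/1001 fps or fields per second, which the 2002-sample buffers in the code below are consistent with. The helper name `milliSamplesPerFrame` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: audio samples per video frame, expressed in
// thousandths of a sample so that fractional results (e.g. 800.8)
// stay exact in integer arithmetic.
// The frame rate is the rational number fpsNum / fpsDen.
std::int64_t milliSamplesPerFrame(std::int64_t sampleRate,
                                  std::int64_t fpsNum,
                                  std::int64_t fpsDen)
{
    // samples per frame = sampleRate / (fpsNum / fpsDen)
    return sampleRate * fpsDen * 1000 / fpsNum;
}
```

Under those assumptions, `milliSamplesPerFrame(48000, 24000, 1001)` corresponds to exactly 2002 samples per frame, while `milliSamplesPerFrame(48000, 60000, 1001)` corresponds to 800.8, so at that rate the per-frame sample count cannot be the same on every frame.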

But I am definitely submitting different audio data to the audio encode/sample-buffer creation process, as I have logged this in great detail (though the logging is not shown in the code below).

I have tried a number of different things to modify the code and expose the issue, but have had no success. Hence I am asking here, in the hope that someone can either spot an issue in my code or give me some information about this problem.

The code I am using is as follows:

int main(int argc, const char * argv[])
{
    @autoreleasepool
    {
        NSLog(@"Hello, World!  - Welcome to the ProResCapture With Audio sample app. ");
        OSStatus status;
        AudioStreamBasicDescription audioFormat;
        CMAudioFormatDescriptionRef audioFormatDesc;

        // OK so lets include the hardware stuff first and then we can see about doing some actual capture  and compress stuff
        HARDWARE_HANDLE pHardware = sdiFactory();
        if (pHardware)
        {
            unsigned long ulUpdateType = UPD_FMT_FRAME;
            unsigned long ulFieldCount = 0;
            unsigned int numAudioChannels = 4; //8; //4;
            int numFramesToCapture = 300;

            gBFHancBuffer = (unsigned int*)myAlloc(gHANC_SIZE);

            int audioSize = 2002 * 4 * 16;
            short* pAudioSamples = (short*)new char[audioSize];
            std::vector<short*> vecOfNonInterleavedAudioSamplesPtrs;
            for (int i = 0; i < 16; i++)
            {
                vecOfNonInterleavedAudioSamplesPtrs.push_back((short*)myAlloc(2002 * sizeof(short)));
            }

            bool bVideoModeIsValid = SetupAndConfigureHardwareToCaptureIncomingVideo();

            if (bVideoModeIsValid)
            {

                gBFBytes = (BLUE_UINT32*)myAlloc(gGoldenSize);

                bool canAddVideoWriter = false;
                bool canAddAudioWriter = false;
                int nAudioSamplesWritten = 0;

                // declare the vars for our various AVAsset elements
                AVAssetWriter* assetWriter = nil;
                AVAssetWriterInput* assetWriterInputVideo = nil;
                AVAssetWriterInput* assetWriterAudioInput[16];


                AVAssetWriterInputPixelBufferAdaptor* adaptor = nil;
                NSURL* localOutputURL = nil;
                NSError* localError = nil;

                // create the file we are going to be writing to
                localOutputURL = [NSURL URLWithString:@"file:///Volumes/Media/ProResAVCaptureAnyFormat.mov"];

                assetWriter = [[AVAssetWriter alloc] initWithURL: localOutputURL fileType:AVFileTypeQuickTimeMovie error:&localError];
                if (assetWriter)
                {
                    assetWriter.shouldOptimizeForNetworkUse = NO;

                    // Lets configure the Audio and Video settings for this writer...
                    {
                          // Video First.

                          // Add a video input
                          // create a dictionary with the settings we want ie. Prores capture and width and height.
                          NSMutableDictionary* videoSettings = [NSMutableDictionary dictionaryWithObjectsAndKeys:
                                                                AVVideoCodecAppleProRes422, AVVideoCodecKey,
                                                                [NSNumber numberWithInt:width], AVVideoWidthKey,
                                                                [NSNumber numberWithInt:height], AVVideoHeightKey,
                                                                nil];

                          assetWriterInputVideo = [AVAssetWriterInput assetWriterInputWithMediaType: AVMediaTypeVideo outputSettings:videoSettings];
                          adaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:assetWriterInputVideo
                                                                                                     sourcePixelBufferAttributes:nil];

                          canAddVideoWriter = [assetWriter canAddInput:assetWriterInputVideo];
                    }

                    { // Add a Audio AssetWriterInput

                          // Create a dictionary with the settings we want ie. Uncompressed PCM audio 16 bit little endian.
                          NSMutableDictionary* audioSettings = [NSMutableDictionary dictionaryWithObjectsAndKeys:
                                                                [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
                                                                [NSNumber numberWithFloat:48000.0], AVSampleRateKey,
                                                                [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                                                [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
                                                                [NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
                                                                [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
                                                                [NSNumber numberWithUnsignedInteger:1], AVNumberOfChannelsKey,
                                                                nil];

                          // OR use... FillOutASBDForLPCM(AudioStreamBasicDescription& outASBD, Float64 inSampleRate, UInt32 inChannelsPerFrame, UInt32 inValidBitsPerChannel, UInt32 inTotalBitsPerChannel, bool inIsFloat, bool inIsBigEndian, bool inIsNonInterleaved = false)
                          UInt32 inValidBitsPerChannel = 16;
                          UInt32 inTotalBitsPerChannel = 16;
                          bool inIsFloat = false;
                          bool inIsBigEndian = false;
                          UInt32 inChannelsPerTrack = 1;
                          FillOutASBDForLPCM(audioFormat, 48000.00, inChannelsPerTrack, inValidBitsPerChannel, inTotalBitsPerChannel, inIsFloat, inIsBigEndian);

                          status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
                                                                  &audioFormat,
                                                                  0,
                                                                  NULL,
                                                                  0,
                                                                  NULL,
                                                                  NULL,
                                                                  &audioFormatDesc
                                                                  );

                          for (int t = 0; t < numAudioChannels; t++)
                          {
                              assetWriterAudioInput[t] = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:audioSettings];
                              canAddAudioWriter = [assetWriter canAddInput:assetWriterAudioInput[t] ];

                              if (canAddAudioWriter)
                              {
                                  assetWriterAudioInput[t].expectsMediaDataInRealTime = YES; //true;
                                  [assetWriter addInput:assetWriterAudioInput[t] ];
                              }
                          }


                          CMFormatDescriptionRef myFormatDesc = assetWriterAudioInput[0].sourceFormatHint;
                          NSString* medType = [assetWriterAudioInput[0] mediaType];
                    }

                    if(canAddVideoWriter)
                    {
                          // tell the asset writer to expect media in real time.
                          assetWriterInputVideo.expectsMediaDataInRealTime = YES; //true;

                          // add the Input(s)
                          [assetWriter addInput:assetWriterInputVideo];

                          // Start writing the frames..
                          BOOL success = true;
                          success = [assetWriter startWriting];
                          CMTime startTime = CMTimeMake(0, fpsRate);
                          [assetWriter startSessionAtSourceTime:kCMTimeZero];
                          // [assetWriter startSessionAtSourceTime:startTime];

                      if (success)
                      {
                          startOurVideoCaptureProcess();

                          // **** possible enhancement is to use a pixelBufferPool to manage multiple buffers at once...
                          CVPixelBufferRef buffer = NULL;
                          int kRecordingFPS = fpsRate;
                          bool frameAdded = false;
                          unsigned int bufferID;


                          for( int i = 0; i < numFramesToCapture; i++)
                          {
                              printf("\n");

                              buffer = pixelBufferFromCard(bufferID, width, height, memFmt); // Gets a CVPixelBufferRef from our device, along with the audio data
                              while(!adaptor.assetWriterInput.readyForMoreMediaData)
                              {
                                    printf(" readyForMoreMediaData FAILED \n");
                              }

                              if (buffer)
                              {
                                  // Add video
                                  printf("appending Frame %d ", i);
                                  CMTime frameTime = CMTimeMake(i, kRecordingFPS);
                                  frameAdded = [adaptor appendPixelBuffer:buffer withPresentationTime:frameTime];
                                  if (frameAdded)
                                      printf("VideoAdded.....\n ");

                                  // Add Audio
                                  {
                                      // Do some Processing on the captured data to extract the interleaved Audio Samples for each channel
                                      struct hanc_decode_struct decode;
                                      DecodeHancFrameEx(gBFHancBuffer, decode);
                                      int nAudioSamplesCaptured = 0;
                                      if(decode.no_audio_samples > 0)
                                      {
                                          printf("completed deCodeHancEX, found %d samples \n", ( decode.no_audio_samples  / numAudioChannels) );
                                          nAudioSamplesCaptured = decode.no_audio_samples  / numAudioChannels;
                                      }

                                      CMTime audioTimeStamp = CMTimeMake(nAudioSamplesWritten, 480000); // (samples written) / timescale; NOTE: the timescale here is 480000, whereas the audio sample rate is 48000


                                      // This function repacks the audio from interleaved PCM data into a vector of individual per-channel arrays of audio data
                                      RepackDecodedHancAudio((void*)pAudioSamples, numAudioChannels, nAudioSamplesCaptured, vecOfNonInterleavedAudioSamplesPtrs);

                                      for (int t = 0; t < numAudioChannels; t++)
                                      {
                                          CMBlockBufferRef blockBuf = NULL; // ***********  MUST release these AFTER adding the samples to the assetWriter...
                                          CMSampleBufferRef cmBuf = NULL;

                                          int sizeOfSamplesInBytes = nAudioSamplesCaptured * 2;  // always 16bit memory samples...

                                          // Create sample Block buffer for adding to the audio input.
                                          status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                                                      (void*)vecOfNonInterleavedAudioSamplesPtrs[t],
                                                                                      sizeOfSamplesInBytes,
                                                                                      kCFAllocatorNull,
                                                                                      NULL,
                                                                                      0,
                                                                                      sizeOfSamplesInBytes,
                                                                                      0,
                                                                                      &blockBuf);

                                          if (status != noErr)
                                                NSLog(@"CMBlockBufferCreateWithMemoryBlock error");

                                          status = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault,
                                                                                                   blockBuf,
                                                                                                   TRUE,
                                                                                                   0,
                                                                                                   NULL,
                                                                                                   audioFormatDesc,
                                                                                                   nAudioSamplesCaptured,
                                                                                                   audioTimeStamp,
                                                                                                   NULL,
                                                                                                   &cmBuf);
                                          if (status != noErr)
                                                NSLog(@"CMSampleBufferCreate error");

                                          // let's check if the CMSampleBuffer is valid
                                          bool bValid = CMSampleBufferIsValid(cmBuf);

                                          // examine these values for debugging info....
                                          CMTime cmTimeSampleDuration = CMSampleBufferGetDuration(cmBuf);
                                          CMTime cmTimePresentationTime = CMSampleBufferGetPresentationTimeStamp(cmBuf);

                                          if (!bValid)
                                              NSLog(@"Invalid Buffer found!!! possible CMSampleBufferCreate error?");


                                          if(!assetWriterAudioInput[t].readyForMoreMediaData)
                                              printf(" readyForMoreMediaData FAILED  - Had to Drop a frame\n");
                                          else
                                          {
                                              if(assetWriter.status == AVAssetWriterStatusWriting)
                                              {
                                                  BOOL r = YES;
                                                  r = [assetWriterAudioInput[t] appendSampleBuffer:cmBuf];
                                                  if (!r)
                                                  {
                                                      NSLog(@"appendSampleBuffer error");
                                                  }
                                                  else
                                                      success = true;

                                              }
                                              else
                                                  printf("AssetWriter Not ready???!? \n");
                                        }

                              if (cmBuf)
                              {
                                  CFRelease(cmBuf);
                                  cmBuf = 0;
                              }
                              if(blockBuf)
                              {
                                  CFRelease(blockBuf);
                                  blockBuf = 0;
                              }
                          }
                          nAudioSamplesWritten = nAudioSamplesWritten + nAudioSamplesCaptured;
                      }

                      if(success)
                      {
                          printf("Audio tracks Added..");
                      }
                      else
                      {
                          NSError* nsERR = [assetWriter error];
                          printf("Problem Adding Audio tracks / samples");
                      }
                      printf("Success \n");
                }


              if (buffer)
              {
                  CVBufferRelease(buffer);
              }
          }
      }
      AVAssetWriterStatus sta = [assetWriter status];
      CMTime endTime = CMTimeMake((numFramesToCapture-1), fpsRate);

      if (audioFormatDesc)
      {
          CFRelease(audioFormatDesc);
          audioFormatDesc = 0;
      }

      // Finish the session
      StopVideoCaptureProcess();
      [assetWriterInputVideo markAsFinished];
      for (int t = 0; t < numAudioChannels; t++)
      {
          [assetWriterAudioInput[t] markAsFinished];
      }

      [assetWriter endSessionAtSourceTime:endTime];


      bool finishedSuccessfully = [assetWriter finishWriting];
      if (finishedSuccessfully)
          NSLog(@"Writing file ended successfully \n");
      else
      {
          NSLog(@"Writing file ended WITH ERRORS...");
          sta = [assetWriter status];
          if (sta != AVAssetWriterStatusCompleted)
          {
              NSError* nsERR = [assetWriter error];
              printf("investigating the error \n");
          }
      }
                    }
                    else
                    {
      NSLog(@"Unable to Add the InputVideo Asset Writer to the AssetWriter, file will not be written - Exiting");
                    }

                    if (audioFormatDesc)
      CFRelease(audioFormatDesc);
                }


                for (int i = 0; i < 16; i++)
                {
                    if (vecOfNonInterleavedAudioSamplesPtrs[i])
                    {
      bfFree(2002 * sizeof(unsigned short), vecOfNonInterleavedAudioSamplesPtrs[i]);
      vecOfNonInterleavedAudioSamplesPtrs[i] = nullptr;
                    }
                }

            }
            else
            {
                NSLog(@"Unable to find a valid input signal - Exiting");
            }


            if (pAudioSamples)
                delete[] reinterpret_cast<char*>(pAudioSamples); // allocated above with new char[]
        }
    }
    return 0;
}

It's a very basic sample that connects to some special hardware (the code for that is left out).

It grabs frames of video and audio, and then processes the audio to go from interleaved PCM to individual arrays of PCM data, one for each track.

Each buffer is then added to the appropriate track, be it video or audio.
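
The repacking helper itself is not shown in the listing above. As a minimal sketch of what such an interleaved-to-per-channel step might look like (the function name `deinterleave` and its signature are hypothetical, and it returns freshly allocated vectors instead of filling pre-allocated buffers as the listing does):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the repack step: split interleaved 16-bit PCM
// laid out as ch0, ch1, ..., chN-1, ch0, ch1, ... into one contiguous
// buffer per channel.
std::vector<std::vector<std::int16_t>> deinterleave(const std::int16_t* interleaved,
                                                    std::size_t numChannels,
                                                    std::size_t framesPerChannel)
{
    std::vector<std::vector<std::int16_t>> channels(
        numChannels, std::vector<std::int16_t>(framesPerChannel));

    for (std::size_t frame = 0; frame < framesPerChannel; ++frame)
    {
        for (std::size_t ch = 0; ch < numChannels; ++ch)
        {
            // Sample for channel `ch` at time index `frame`.
            channels[ch][frame] = interleaved[frame * numChannels + ch];
        }
    }
    return channels;
}
```

Dumping each returned channel to its own raw file is one way to confirm that the per-track data differs from frame to frame before it reaches the asset writer.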

Lastly, the AVAssetWriter session is finished and closed, and I clean up and exit.

Any help will be most appreciated,

Cheers,

James

  • Have you tried dumping the raw samples to individual channel files to verify that you are receiving them correctly? – Rhythmic Fistman Dec 19 '16 at 00:23
  • I am dumping the samples (well, a few of them on each channel) prior to sending them to the asset writer, and I can clearly see that I am passing different samples for each frame. – James Dec 19 '16 at 03:52

1 Answer


Well, I finally found a working solution for this problem.

The solution comes in 2 parts:

  1. I moved from using CMAudioSampleBufferCreateWithPacketDescriptions to using CMSampleBufferCreate(..), with the appropriate arguments to that function call.

  2. Initially, when experimenting with CMSampleBufferCreate, I was misusing some of the arguments, and it gave me the same results as I originally outlined here. After carefully examining the values I was passing in the CMSampleTimingInfo struct, specifically the duration field, I eventually got everything working correctly!

So it appears that I was creating the CMBlockBufferRef correctly, but I needed to take more care when using it to create the CMSampleBufferRef that I was passing to the AVAssetWriterInput!
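
As a rough sketch of the timing arithmetic involved (this is not the exact code used, and `RationalTime`, `AudioTiming`, `timingForBuffer` and `bufferEnd` are hypothetical stand-ins rather than CoreMedia types or functions): for uncompressed PCM described by a single timing entry, the duration is that of one sample frame (1/48000 s here), not of the whole buffer, while the presentation timestamp is the running sample count over the sample rate.

```cpp
#include <cstdint>

// Hypothetical stand-in for a rational time (value / timescale seconds).
// It mirrors the shape of CMTime but does not use CoreMedia.
struct RationalTime
{
    std::int64_t value;
    std::int32_t timescale;
};

// Hypothetical stand-in for the two timing fields that matter here.
struct AudioTiming
{
    RationalTime duration;               // duration of ONE sample frame
    RationalTime presentationTimeStamp;  // start time of the whole buffer
};

// Timing for a buffer that starts after `samplesWrittenSoFar` samples.
AudioTiming timingForBuffer(std::int64_t samplesWrittenSoFar, std::int32_t sampleRate)
{
    AudioTiming timing;
    timing.duration = RationalTime{1, sampleRate};
    timing.presentationTimeStamp = RationalTime{samplesWrittenSoFar, sampleRate};
    return timing;
}

// End time of a buffer holding `numSamples` frames. Both fields share the
// same timescale by construction, so the values can be added directly.
RationalTime bufferEnd(const AudioTiming& timing, std::int64_t numSamples)
{
    return RationalTime{
        timing.presentationTimeStamp.value + numSamples * timing.duration.value,
        timing.presentationTimeStamp.timescale};
}
```

With 2002 samples per buffer, the first buffer ends at 2002/48000 s, which is exactly where the second buffer's presentation timestamp begins, so consecutive buffers abut without overlapping.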

Hope this helps someone else, as it was a nasty one for me to solve!
