
I'm adding an m4a audio file from the file system, loaded via an AVURLAsset, into an AVMutableComposition. If the loaded asset has a duration of 1s, adding it to the AVMutableComposition results in the composition also having a duration of 1s. This makes sense. However, after exporting the composition to a new file using AVAssetExportSession, the resulting m4a file has a duration roughly 0.05s shorter than the composition's (0.95s vs. 1s).

This bug only happens with m4a files. If I work with caf files instead, the exported file's duration matches the composition's.

I originally discovered this bug when I was combining an existing audio file with a new audio file. Despite the AVMutableComposition reporting the correct duration, the file exported by AVAssetExportSession was ~0.05s shorter. To simplify this question, I've removed the code that combines two existing audio files and instead simply insert a single new audio file into an empty mutable composition. The bug still occurs even in this simple case.

Here's my mutable composition code:

    let composition = AVMutableComposition()
    guard
      let compositionTrack = composition.addMutableTrack(
        withMediaType: .audio,
        preferredTrackID: kCMPersistentTrackID_Invalid)
    else
    {
      fatalError("Could not create an AVMutableCompositionTrack.")
    }

    let newAudioAsset = AVURLAsset(
      url: newAudioFileURL, 
      options: [AVURLAssetPreferPreciseDurationAndTimingKey: true])

    guard let newAudioTrack = newAudioAsset.tracks(withMediaType: .audio).first else {
      fatalError("Could not get an audio track for the new audio to be inserted.")
    }

    // Insert the new audio.
    try compositionTrack.insertTimeRange(
      CMTimeRangeMake(start: .zero, duration: newAudioAsset.duration),
      of: newAudioTrack,
      at: .zero)
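
For comparison, here is a minimal variant of the same insertion that uses the track's own time range rather than a range built from the asset-level duration (sketch only, to isolate that variable; not verified to change the exported duration):

    // Sketch: insert the audio track's own timeRange instead of a range built
    // from newAudioAsset.duration, since the track- and asset-level durations
    // can differ slightly. Reuses compositionTrack and newAudioTrack from above.
    try compositionTrack.insertTimeRange(
      newAudioTrack.timeRange,
      of: newAudioTrack,
      at: .zero)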

Here's my export code:

    guard
      let exportSession = AVAssetExportSession(
        asset: composition,
        presetName: AVAssetExportPresetAppleM4A)
    else
    {
      fatalError("Could not create an AVAssetExploreSession.")
    }

    self.exportSession = exportSession

    exportSession.outputURL = audioFileURL
    exportSession.outputFileType = .m4a
    exportSession.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
    exportSession.exportAsynchronously(completionHandler: { [weak self, weak exportSession] in
      guard let self = self, let exportSession = exportSession else {
        fatalError()
      }

      switch exportSession.status {
      case .completed:
        // Roughly 0.05s less
        let savedFileAsset = AVURLAsset(
          url: audioFileURL, 
          options: [AVURLAssetPreferPreciseDurationAndTimingKey: true])

        assert(savedFileAsset.duration == composition.duration) // fails; saved file asset is about 0.05s less

        self.exportSession = nil

      default:
        if let error = exportSession.error {
          self.exportSession = nil
          print(error.localizedDescription)
        }
      }
    })
  • can you show the code for creating the composition? – Rhythmic Fistman Jan 03 '22 at 10:17
  • @RhythmicFistman I've added mutable composition code, and also simplified the setup - this bug occurs even when adding a single existing m4a file to an empty mutable composition (combining two existing m4a files also reproduces the issue, but that setup is a bit more complicated so I've left it out). – blkhp19 Jan 04 '22 at 00:23
  • `AVAsset.duration` is ambiguous because the asset can have more than one track and even audio-only files can have metadata that affects the duration. Extra ambiguity comes from the non-doco on what `AVAssetExportPresetAppleM4A` actually does and you haven't mentioned the format of your input files (transcoding / resampling can affect duration too). Plus there's that old chestnut `AVURLAssetPreferPreciseDurationAndTimingKey` which sounds like it should fix all your problems, but is a no-op on movs/mp4/m4as and can't fix an ambiguous API. SO - why do you need the durations to exactly match? – Rhythmic Fistman Jan 04 '22 at 15:44
  • @RhythmicFistman I'm saving timestamps and durations of various parts of an audio file, which I could then reference later for audio editing purposes (deleting chunks of audio). When deleting parts of the audio, if the duration that I saved off was shorter than the duration of the actual saved audio track, then the deletion would remove all but 0.05s of audio, when my intention is to delete the entire audio file. I can work around this by just deleting the original audio file if an editing operation results in some very small amount of leftover duration. – blkhp19 Jan 07 '22 at 07:09
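
As a follow-up to the duration ambiguity raised in the comments above, here is a small inspection sketch (not from the original post; it reuses `audioFileURL` from the export code) that compares the exported file's asset-level duration with the duration its audio track reports:

    // Sketch: check whether the ~0.05s gap shows up at the asset level, the
    // track level, or both, on the exported file.
    let exported = AVURLAsset(
      url: audioFileURL,
      options: [AVURLAssetPreferPreciseDurationAndTimingKey: true])

    let assetSeconds = CMTimeGetSeconds(exported.duration)
    let trackSeconds = exported.tracks(withMediaType: .audio).first
      .map { CMTimeGetSeconds($0.timeRange.duration) } ?? 0

    print("asset duration: \(assetSeconds)s, audio track duration: \(trackSeconds)s")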
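
And here is a sketch of the tolerance-based workaround described in the last comment (`rangeToDelete` and the 0.1s threshold are illustrative placeholders, not values from the post):

    // Sketch of the workaround: if an edit would leave only a sliver of audio
    // (because the export came up ~0.05s short), delete the whole file instead
    // of performing the edit. savedFileAsset comes from the export code above.
    let leftover = CMTimeSubtract(savedFileAsset.duration, rangeToDelete.duration)
    let threshold = CMTime(seconds: 0.1, preferredTimescale: 600)

    if CMTimeCompare(leftover, threshold) <= 0 {
      // Nothing meaningful would remain; remove the file entirely.
      try FileManager.default.removeItem(at: audioFileURL)
    } else {
      // Otherwise remove just rangeToDelete from the composition as usual.
    }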

0 Answers