As a collector, I have thousands of audio files downloaded from podcasting services. Every feed starts with the same 15-second introduction. I find that very annoying, so I tried to crop it out of all of them.

But the files are not uniform. The spoken introduction is exactly the same in each, but in some files...

  • ... it starts at 00:00, at 00:05, or at some other offset that we don't know
  • ... the introduction is not present at the start at all

I couldn't determine at which second each file should be cropped.

The question: how can I crop all of the audio files based on a specific audio clip?

In other words: "detect the matching part and remove it"?

RarLines
  • Have you tried comparing two of the intro clips visually? Do they look exactly the same or close to it? If so, then you could perform a simple search. Otherwise, it could be pretty difficult. – jaket May 19 '14 at 23:34
  • It might be helpful to let us know what software platform(s) you'd be comfortable using for the solution. – Ross Bencina May 20 '14 at 10:20
  • Hi @jacket. The two intro clips look exactly the same, or close to it. Dear Ross Bencina, you're right. I also don't know which software platform(s) I need. Maybe Audacity. – RarLines May 20 '14 at 11:32
  • @RarLines Stackoverflow is a programming community, I assume that you know how to program a computer to process audio? – Ross Bencina May 20 '14 at 12:24

1 Answer

As I understand it, you already have a way to crop the files at a specific point, so the problem boils down to working out where the intro ends in each clip. Here's how I would do it:

  • First, manually isolate the intro audio in a separate file/buffer.
  • For each clip, you need to work out where in the clip the intro audio occurs. Do this by computing a cross-correlation between the intro audio and the main clip. The correct offset will be the one with the highest correlation coefficient. (You could also look for the minimum of the mean squared difference, which is closely related and gives the same offset when the signal level is roughly constant.)
  • Once you know where the intro audio is, you can calculate your crop position.
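
The steps above can be sketched in Python with NumPy. This is only a minimal illustration, assuming both signals are already decoded to mono sample arrays at the same sample rate; the function name `find_intro_offset` is made up for this example. `np.correlate` computes the correlation directly, which is slow for long signals, so for real audio you would probably want an FFT-based correlation (for example `scipy.signal.correlate`) or downsampled signals.

```python
import numpy as np

def find_intro_offset(clip, intro):
    """Return (offset, score) for the best match of `intro` inside `clip`.

    `offset` is a sample index into `clip`; `score` is the normalized
    correlation coefficient at that offset, in the range [-1, 1].
    """
    clip = np.asarray(clip, dtype=np.float64)
    intro = np.asarray(intro, dtype=np.float64)
    n = len(intro)
    if n == 0 or len(clip) < n:
        raise ValueError("intro must be non-empty and no longer than clip")

    # Dot product of the intro with every intro-length window of the clip.
    raw = np.correlate(clip, intro, mode="valid")

    # Energy of each clip window, from a running sum of squares.
    csum = np.concatenate(([0.0], np.cumsum(clip ** 2)))
    window_energy = np.maximum(csum[n:] - csum[:-n], 0.0)

    # Normalize so that a perfect match scores 1.0 regardless of volume.
    denom = np.sqrt(window_energy * np.sum(intro ** 2))
    scores = np.zeros_like(raw)
    np.divide(raw, denom, out=scores, where=denom > 1e-12)

    best = int(np.argmax(scores))
    return best, float(scores[best])
```

The crop position is then `offset + len(intro)` samples, i.e. the first sample after the intro.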

There are a few obvious optimisations:

  • Only search for the intro audio in the first (say) 30 seconds of each clip.
  • Don't search for the whole intro audio, just the last 1/2 second.
  • If you're not 100% sure that the audio is there, you might want to set a threshold for acceptance.
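
A rough sketch of how these three optimisations might fit together, again in Python with NumPy. The function name `crop_start`, the parameter names and the default threshold of 0.8 are assumptions for illustration, not tested values; the threshold in particular would need tuning on real files.

```python
import numpy as np

def crop_start(clip, intro, rate, search_seconds=30.0, tail_seconds=0.5,
               threshold=0.8):
    """Return the sample index just after the intro, or None if not found.

    Only the last `tail_seconds` of `intro` are searched for, and only
    within the first `search_seconds` of `clip`. `rate` is samples/second.
    """
    clip = np.asarray(clip, dtype=np.float64)
    intro = np.asarray(intro, dtype=np.float64)
    if len(intro) == 0:
        return None

    # Keep just the tail of the intro and the head of the clip.
    tail_len = max(1, min(len(intro), int(tail_seconds * rate)))
    tail = intro[len(intro) - tail_len:]
    head = clip[:int(search_seconds * rate)]
    if len(head) < tail_len:
        return None

    # Normalized cross-correlation of the tail against every head window.
    raw = np.correlate(head, tail, mode="valid")
    csum = np.concatenate(([0.0], np.cumsum(head ** 2)))
    energy = np.maximum(csum[tail_len:] - csum[:-tail_len], 0.0)
    denom = np.sqrt(energy * np.sum(tail ** 2))
    scores = np.zeros_like(raw)
    np.divide(raw, denom, out=scores, where=denom > 1e-12)

    best = int(np.argmax(scores))
    # Reject weak matches: the clip probably has no intro at all.
    if scores[best] < threshold:
        return None
    # The tail ends where the intro ends, so crop right after it.
    return best + tail_len
```

A clip for which this returns `None` would be left uncropped, which covers the files that do not have the introduction at the start.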
Ross Bencina
  • Thanks for the reply, but Stack Overflow is overwhelming me! How can I do the "search" and "set a threshold"? I couldn't find the required software platforms. – RarLines May 20 '14 at 11:35