I'm using YouTube's "auto-generated" captions feature to generate transcripts of mp3 files. I do this by converting each mp3 to a blank mp4 (audio only, no picture), uploading it to YouTube, waiting for the auto-generated captions to appear, and then extracting the SRT file.

The issue I'm having though is that a few of the mp3 files I've uploaded have been flagged as having copyrighted content, and as such no auto-generated captions have been made for them.

I have no desire to publish the mp3s on YouTube; they're uploaded as unlisted videos, and all I require are the SRT files. Is there a way to manipulate the audio to bypass YouTube's Content ID system? I've tried altering the pitch in Audacity, but no matter how subtle or extreme the pitch change is, the files are still flagged as having copyrighted content. Is there anything else I can do to the audio, other than adjusting the pitch, that might work?

I'm hoping this post doesn't breach any rules on here, and I can't stress enough that I'm not looking to publish these mp3s, I just want the auto-generated SRTs.

Adam

1 Answer

No one can know how to cheat on Content ID

Content ID is a proprietary algorithm developed by Google, so no one outside the company can know for sure how it detects copyrighted audio in a video.

We can assume, though, that one of the first things they did was to make the algorithm pitch-independent. Otherwise, everyone would change the pitch of their videos and cheat Content ID easily.

How to use YouTube to get your subtitles anyway

If I am not mistaken, Content ID blocks you because of musical content, rather than vocal content. Thus, to address your original problem, one solution would be to detect musical content (based on spectral analysis) and cut it from the original audio. If the problem is with pure vocal content as well, you could try to filter it heavily and that might work.

Other solutions

Since YouTube is made by Google, why not use the Speech API that Google offers directly? It most likely performs the audio transcription on YouTube as well. And if the results are not satisfying, you could try other services (IBM, Microsoft, Amazon and others have their own).
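
Whichever service you pick, you will probably have to build the SRT file yourself, since these APIs generally return text with timing information rather than a finished subtitle file. Here is a minimal sketch of that last step. It assumes you have already turned the service's response into a list of segments with a start time, an end time (both in seconds) and the text; the `Segment` class and the function names are hypothetical and not part of any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk. Hypothetical shape, not a real API response type."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT text: numbered cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up example data, standing in for a parsed transcription result.
    demo = [Segment(0.0, 2.5, "Hello there."), Segment(2.5, 4.0, "Second line")]
    print(to_srt(demo))
```

How you group words into segments (by sentence, by pause, or by a fixed duration) is up to you and depends on what timing detail the service you choose actually returns.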

filaton
  • Thank you. I was unaware of the Speech API, that looks very interesting, I'll give that a try. The MP3s don't actually contain any music, however I believe some of the content was once published in a compilation, which is why they're being flagged. – Adam Aug 23 '17 at 10:28
  • Interesting, I thought *Content ID* was used only for musical content :) And the Speech API is definitely worth a try, and it's really amazingly easy to get started! – filaton Aug 23 '17 at 12:21