How can you avoid false positives with music identifying algorithms?

Question

I’m a music producer/composer who will be submitting works to new music libraries. In some cases, I’d like to use previous projects as a starting point. So while the result will be new, unique compositions, I want to avoid a scenario where an algorithm might mistake a new song (or song segment) for a previous work.

I’d like to develop some rules of thumb to keep in mind to ensure this doesn’t happen. Specifically, I’d like to understand more about how music-identifying algorithms work, and which combination of parameters needs to be different (and to what degree) to avoid false positive identifications against my other works.

For example:

Imagine “song a” is part of “library a”. Then I create “song b” for “library b”. The arrangement is similar, same instruments are used, same tempo, same key, and mix is essentially the same. But the chord progression and melody are different, though a similar vibe. Could that trigger a false positive?
Or a scenario like the one above where the instrumentation is similar, but with some alternate voices (like an alternate synth patch for the bassline, and similar but different percussion samples), a new key, and a speed increase of 5 bpm. Is that enough to differentiate?
Or imagine a scenario where the bulk of the track is significantly different for all parameters, including a new tempo and key, except there is a 20 second break in the middle that resembles a previous work: an ambient tonal bed with light percussion. The same tonal bed is used, but in the new key and tempo, and the percussion is close to the same. Then a user uses only those 20 seconds in a video. How different would those 20 seconds need to be from the original, and across what parameters, to avoid a false positive?

These examples are just thought experiments to try to understand how it all works. I imagine any new compositions I make should easily be different enough from previous compositions, and the cumulative differences would easily extend beyond the tenets listed in the above scenarios.

But given that some parameters could be very similar (even just from a mix perspective and the instruments used), I would like to develop a deeper understanding of what gets analyzed and, consequently, which differences I should make sure are always present, because it seems to me that even 20 seconds of enough similarity could trigger a potential issue.

Thanks!

P.S. I welcome any insight offered, and am certainly receptive to the answer being couched in coding language; this is Stack Exchange after all, and it could be pretty interesting. But at the end of the day, I’m not a coder (though I am coding-curious), and I need to translate any clarity offered into practical considerations that can be applied from a music production POV. Which is to say, if it’s easy enough to include some language and concepts with that in mind, I’d be very grateful. Parameters like: tempo, key, chord progressions, rhythm elements, frequency considerations, sounds used, overall mix, etc. Thanks again!

Can you point to a specific product/algorithm you had in mind when writing this...? Many of the algorithms developed by the biggest players in this industry are proprietary & the source of competitive advantage, which would preclude them being discussed openly in a public forum like Stack Overflow. It's not clear why anyone with the requisite knowledge to answer this question would do so and risk extreme personal legal liability for the disclosure of such trade secrets. I'm also not so sure this is really covered by the spirit of the scope of the site defined in the [help/on-topic]. — esqew, Jan 12 '22 at 20:32
Fair enough. At the moment, I imagine YouTube is probably the most likely platform where an issue could occur, though I also know distributors such as DistroKid are getting in on the tech. And while I recognize there are differences between algos (and am sure they evolve every day), I’d assumed there was enough of a common basis among them that some broad considerations might be drawn. If the request isn’t within the scope of the site, I’m open to suggestions about where else to go. But maybe the question might be addressed in a way that avoids these concerns? Thanks — Composed, Jan 12 '22 at 21:20
Sorry - however interesting the topic is, this doesn't belong on Stack Overflow. This forum is for well-defined coding issues. Please move this inquiry/discussion to a different forum (there are literally [hundreds of different feeds on StackExchange](https://stackexchange.com/sites)); maybe you could try [SoftwareEngineering](https://softwareengineering.stackexchange.com/) ... — muka.gergely, Jan 12 '22 at 21:29
Understood. I’m new to the forum and unsure of how to make such a transfer; I will look into how to do it in a little while when I’m on my desktop. Thanks — Composed, Jan 12 '22 at 22:24
Oh, from what I can tell I’m unable to make the move as a new user without enough rep...and unable to delete as well. If mod and/or other users w/ enough rep agree with the move, please go ahead. Or if anyone can clarify steps I’m able to take, I’m all ears. — Composed, Jan 12 '22 at 23:18
@Composed if you allow me this suggestion: try to reach YouTube itself - see if there is a support form or a YouTube-oriented support team you can contact to share your inquiry - probably via their [policies about copyright](https://www.youtube.com/howyoutubeworks/policies/copyright/) or similar. Interesting topic, indeed. — Marco Aurelio Fernandez Reyes, Jan 17 '22 at 21:40
@Composed some other resources that might be of assistance... https://support.google.com/assistant/answer/7554088?hl=en&co=GENIE.Platform%3DAndroid - or - https://ai.googleblog.com/2018/09/googles-next-generation-music.html - or - https://www.midomi.com/ — Trentium, Jan 25 '22 at 23:23

score 0 · Answer 1 · answered Jan 25 '22 at 21:16

Attempting to actually answer the question, despite the discussion in the comments: I happen to know of the existence of this video by Computerphile. At least some of the music-matching algorithms out in the wild are presumably based on the approach it describes.

P.S. The linked video is How Shazam Works (Probably!), featuring David Domminney Fowler. I barely remember the details of the video beyond the fact that it exists, which is why this answer is so thin. Edits are welcome.
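For the coding-curious, here is a rough sketch of the general idea behind Shazam-style fingerprinting as publicly described: pick out the loudest frequencies over time, then hash pairs of them together with the time gap between them. Everything in it is an assumption made for illustration (the sample rate, the frame size, one peak per frame, and the hypothetical helper names `tone_sequence`, `strongest_bin`, `fingerprint` and `similarity`). It is not how YouTube or any other specific product works; those systems are proprietary.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; illustrative value
FRAME = 128         # samples per analysis frame (62.5 Hz per frequency bin)


def tone_sequence(freqs_hz):
    """Synthesize a toy 'track': one frame-long sine tone per note."""
    samples = []
    for freq in freqs_hz:
        for n in range(FRAME):
            samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def strongest_bin(frame):
    """Index of the loudest frequency bin in one frame (naive DFT)."""
    size = len(frame)
    best_bin, best_mag = 0, 0.0
    for k in range(1, size // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / size)
                  for t, x in enumerate(frame))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin


def fingerprint(samples, fan_out=3):
    """Hash each peak with the next few peaks: (anchor, target, time gap)."""
    peaks = [strongest_bin(samples[i:i + FRAME])
             for i in range(0, len(samples) - FRAME + 1, FRAME)]
    hashes = set()
    for i, anchor in enumerate(peaks):
        for gap in range(1, fan_out + 1):
            if i + gap < len(peaks):
                hashes.add((anchor, peaks[i + gap], gap))
    return hashes


def similarity(a, b):
    """Fraction of the smaller fingerprint that also appears in the other."""
    return len(a & b) / max(1, min(len(a), len(b)))


# Toy comparison: a short excerpt versus the same notes moved up a fifth.
original = fingerprint(tone_sequence([500, 750, 1000, 625]))
excerpt = fingerprint(tone_sequence([500, 750, 1000]))
transposed = fingerprint(tone_sequence([750, 1125, 1500, 937.5]))

print(similarity(original, excerpt))
print(similarity(original, transposed))
```

In this toy setup the excerpt shares all of its hashes with the original, while the transposed version shares none, because each hash is built from absolute frequencies and time gaps. In production terms: a scheme like this keys on which frequencies are loud and when, so an unchanged short segment can match on its own, while key and tempo changes move the peaks. Real systems may well be built to tolerate such changes, so don't read this as a recipe for what is "different enough".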
