21

I'm developing an iOS application with the iOS 5.0 SDK and Xcode 4.2.

I want to develop an application that recognizes sounds. I see there is an application called SoundHound that recognizes music and tells you the artist and title.

How can I do something similar? I want to compare a sound to an existing sound database. How can I do that?

Maybe I can use a Fourier transform? I don't know how to process sounds. Or could it be similar to speech recognition?

VansFannel
  • 6
    It's a fairly simple algorithm, however the real key to the application is the fact that the algorithm is patented, so if you try to implement it yourself, vampire-teethed lawyers will appear from thin air and suck the life out of you... ;) – Lindydancer Mar 20 '12 at 06:22
  • Well, I don't want that. I want to know how I can compare sounds. Using a Fourier transform? – VansFannel Mar 20 '12 at 06:24
  • 1
    This could help: http://gizmodo.com/5647458/how-shazam-works-to-identify-nearly-every-song-you-throw-at-it – VansFannel Mar 20 '12 at 06:25
  • 2
    Thanks for voting to close without saying why. – VansFannel Mar 20 '12 at 06:39
  • Another interesting article: http://www.codeproject.com/Articles/32172/FFT-Guitar-Tuner – VansFannel Mar 20 '12 at 06:42

3 Answers

21

I came across a paper which explains how audio search algorithms work. Here is the link. It was written by one of the developers of Shazam, a rival application of SoundHound.

Subodh
7

Good links on the Wikipedia page include https://surdu.me/2011/01/20/how-does-shazam-work.html and the paper http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf, which Sub_stantial noted earlier.

Nicu Surdu
Michael Levy
1

Shazam is one of the best examples of an application of linked open data: it takes a short music sample from the end user, identifies the song from its datasets, and links to a place to purchase the album.

The user tags a song for 10 seconds and the application creates an audio fingerprint based on some of the anchors of the simplified spectrogram and the target area between them.
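
To make the "simplified spectrogram" idea concrete, here is a minimal Python sketch. It assumes the simplification is "keep only the strongest frequency bin of each frame", which is my own shortcut for illustration and not necessarily what Shazam does. The names `dft_magnitudes`, `spectrogram_peaks` and `tone` are hypothetical, and the naive DFT is only there to keep the example dependency-free.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the lower half of a naive DFT (O(n^2), fine for a demo)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]

def spectrogram_peaks(samples, frame_size=64):
    """Return one (frame_index, strongest_bin) pair per non-overlapping frame.

    Assumption: a real system keeps several peaks per frame and uses
    overlapping, windowed frames; this keeps only the single loudest bin.
    """
    peaks = []
    for i in range(len(samples) // frame_size):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        mags = dft_magnitudes(frame)
        peaks.append((i, max(range(len(mags)), key=mags.__getitem__)))
    return peaks

def tone(bin_index, frame_size=64):
    """One frame of a sine wave whose energy falls exactly in `bin_index`."""
    return [math.sin(2 * math.pi * bin_index * t / frame_size)
            for t in range(frame_size)]

# Two frames of one tone followed by two frames of another.
samples = tone(5) + tone(5) + tone(12) + tone(12)
peaks = spectrogram_peaks(samples)
```

Each entry in `peaks` is a (time, frequency) point; points like these are what the anchors and target areas are chosen from.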

For each point of the target area, they create a hash value that is the combination of the frequency at which the anchor point is located, the frequency at which the point in the target zone is located, and the time difference between the point in the target zone and when the anchor point is located in the song.
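
The hash described above can be sketched in a few lines of Python. This assumes the peaks are already available as (time, frequency) pairs sorted by time, and that the "target area" of an anchor is simply the next few peaks after it; `fingerprint` and `fan_out` are hypothetical names chosen for this example.

```python
def fingerprint(peaks, fan_out=3):
    """Pair each anchor peak with the next `fan_out` peaks (its target area).

    peaks: list of (time, freq) tuples sorted by time.
    Returns a list of (hash, anchor_time), where the hash combines the
    anchor frequency, the target frequency and their time difference.
    """
    hashes = []
    for i, (anchor_time, anchor_freq) in enumerate(peaks):
        for target_time, target_freq in peaks[i + 1:i + 1 + fan_out]:
            key = (anchor_freq, target_freq, target_time - anchor_time)
            hashes.append((key, anchor_time))
    return hashes

# Three made-up peaks: (time, frequency)
example = fingerprint([(0, 40), (2, 55), (3, 30)], fan_out=2)
```

The anchor time is kept next to each hash because the hash alone says nothing about where in the song the pair occurred.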

Once the fingerprint of the audio is created, Shazam starts the search for matches in the database. If there is a match, the information is returned to the user; otherwise it returns a “song not known” dialogue.
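
A rough Python sketch of such a lookup is below. It assumes the database is an in-memory inverted index from hash to (song, anchor time), and that a match is declared when enough hashes agree on the same time offset between the sample and the song. The names `build_index`, `best_match` and the `min_votes` threshold are hypothetical, and the hash values are made up.

```python
from collections import Counter, defaultdict

def build_index(songs):
    """songs: {song_id: [(hash, anchor_time), ...]} -> {hash: [(song_id, time)]}"""
    index = defaultdict(list)
    for song_id, hashes in songs.items():
        for h, t in hashes:
            index[h].append((song_id, t))
    return index

def best_match(index, sample_hashes, min_votes=2):
    """Vote for (song, time offset); return the winning song or None."""
    votes = Counter()
    for h, sample_time in sample_hashes:
        for song_id, song_time in index.get(h, ()):
            # Hashes from the true song all share one offset into the track.
            votes[(song_id, song_time - sample_time)] += 1
    if not votes:
        return None
    (song_id, _offset), count = votes.most_common(1)[0]
    return song_id if count >= min_votes else None

songs = {
    "song_a": [((40, 55, 2), 10), ((55, 30, 1), 12), ((30, 70, 4), 13)],
    "song_b": [((40, 55, 2), 3), ((20, 25, 1), 5)],
}
index = build_index(songs)
known = best_match(index, [((40, 55, 2), 0), ((55, 30, 1), 2)])
unknown = best_match(index, [((1, 2, 3), 0)])
```

Here `known` is `"song_a"` because two hashes agree on an offset of 10, while `unknown` is `None`, which corresponds to the "song not known" case.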

Akash Soni