I have written a Python script similar to the Shazam app. It captures 15 seconds of audio and then tries to guess which song in the database it corresponds to. I have stored the database of songs (a dictionary of dictionaries, keyed by song name) as a pickle file, call it song_db.p, since building the database takes several hours for approximately 100 songs. When I run my "Shazam app", these are the steps:

  1. The user presses Enter to begin recording the 15-second sample.
  2. When the recording is complete, I unpickle my database.
  3. I then call a guess_song function to attempt to guess the song.
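
For reference, storing and loading a database of this shape could look roughly like the sketch below. The inner key (`fingerprint`), the `save_db` helper, and the `path` parameter are assumptions for illustration and are not taken from the actual script; only `get_db` and `song_db.p` come from the description.

```python
import pickle

# Hypothetical layout: song name -> dictionary of per-song data.
example_db = {"song_a": {"fingerprint": [1, 2, 3]}}

def save_db(song_db, path="song_db.p"):
    """Write the database once, after the slow build step."""
    with open(path, "wb") as f:
        pickle.dump(song_db, f, protocol=pickle.HIGHEST_PROTOCOL)

def get_db(path="song_db.p"):
    """Read the whole database back into memory (step 2)."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Binary protocols such as `pickle.HIGHEST_PROTOCOL` generally give smaller files and faster loads than the text-based protocol 0, though whether that helps here depends on how song_db.p was originally written.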

Step 2 takes about 45 seconds to complete. Not great! What I would like to do is begin unpickling the database as soon as the user starts the program, or at least alongside the recording, since the two are independent (that alone would save at least 15 seconds, the length of the recording). How can I run the unpickling function in a background process that runs alongside record_song and whatever else is in my main function, while making sure guess_song is not called until the unpickling is complete?
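
One way to get this behavior, sketched under the assumption that recording spends most of its time waiting on the audio device: submit the load to a background thread at startup and call `result()` on the returned future only right before `guess_song`. The function bodies below are hypothetical stand-ins (sleeps instead of real work), and using a thread rather than a process is a choice of this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def get_db():
    """Hypothetical stand-in for the slow unpickling step."""
    time.sleep(0.3)
    return {"song_a": {"fingerprint": [1, 2, 3]}}

def record_song():
    """Hypothetical stand-in for the 15-second recording."""
    time.sleep(0.1)
    return "sample.wav"

def guess_song(recording_file, song_db):
    """Hypothetical stand-in: returns the first song name in the database."""
    return next(iter(song_db))

def main():
    with ThreadPoolExecutor(max_workers=1) as pool:
        db_future = pool.submit(get_db)   # loading starts right away
        recording_file = record_song()    # runs while the load is in progress
        song_db = db_future.result()      # blocks here only if the load is not done yet
    return guess_song(recording_file, song_db)

if __name__ == "__main__":
    print(main())
```

A thread is used here because an object built in a child process would have to be pickled again to be sent back to the parent, which may cancel much of the gain. Since unpickling is CPU-bound, a thread only overlaps well with code that is mostly waiting, which audio capture presumably is.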

This is the code I have written so far. It has gotten me about a 5-10 second boost, but the unpickling only runs in parallel with the recording (not really in the background). I think I can do better than this.

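    from multiprocessing import Process

    def run_in_parallel(*fns):
        """Start each callable in its own process, then wait for all of them."""
        procs = []
        for fn in fns:
            p = Process(target=fn)  # fn's return value is not passed back to the parent
            p.start()
            procs.append(p)
        for p in procs:
            p.join()  # block until every process has finished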

I then call this in my main function as:

    run_in_parallel(recording_file = record_song(), song_db = get_db())

Here record_song() is self-explanatory and get_db() performs the unpickling of the database.
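
Two things are worth noting about a call of the form `run_in_parallel(recording_file=record_song(), song_db=get_db())`: the parentheses run both functions one after the other in the main process before `run_in_parallel` is even entered, and a function declared as `def run_in_parallel(*fns)` rejects keyword arguments with a `TypeError`. Separately, `Process(target=fn)` discards whatever `fn` returns. Below is a minimal sketch of a variant that takes the function objects themselves and hands back their results, assuming threads are acceptable in place of processes.

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_parallel(*fns):
    """Run zero-argument callables concurrently; return results in call order."""
    if not fns:
        return []
    with ThreadPoolExecutor(max_workers=len(fns)) as pool:
        futures = [pool.submit(fn) for fn in fns]
        # result() blocks until that callable is done and re-raises its exception, if any
        return [future.result() for future in futures]
```

It would be called with the bare names, for example `recording_file, song_db = run_in_parallel(record_song, get_db)`.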

user3501476