Audio Detection for Python

Question

I want to write a Python program that detects system audio. OpenCV's matchTemplate can detect a certain "characteristic" of an image and return a score that I can compare against a threshold. Is there a module that does the same for audio, so that a function runs once the threshold is crossed? I am making this program for fishing in Minecraft. A mod "pings" me (plays a sound) whenever a fish is hooked on the fishing rod. I want to record and save that sound in the program (which can easily be done using modules), check whether the live audio matches it above a certain threshold, and if so right-click the mouse (which can be done with the pyautogui module). The program should run indefinitely until I press a key (which can be done using keyboard.is_pressed("ctrl")).
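A minimal sketch of what an audio analogue of match template could look like, assuming the sound is available as a plain list of samples: slide the reference "ping" over a longer buffer and keep the best normalized cross-correlation score, which always lies in [-1, 1]. The function name best_match_score and the synthetic 880 Hz ping are hypothetical and only illustrate the idea. This pure-Python version is slow for real recordings; a library routine such as scipy.signal.correlate would be the usual choice there.

```python
import math


def best_match_score(haystack, needle):
    """Return the highest normalized cross-correlation (in [-1, 1]) between
    `needle` and every window of `haystack` that has the same length."""
    n = len(needle)
    if n == 0 or len(haystack) < n:
        return 0.0
    needle_mean = sum(needle) / n
    needle_c = [x - needle_mean for x in needle]
    needle_norm = math.sqrt(sum(x * x for x in needle_c))
    if needle_norm == 0:
        return 0.0  # a constant template cannot be matched meaningfully
    best = -1.0
    for start in range(len(haystack) - n + 1):
        window = haystack[start:start + n]
        window_mean = sum(window) / n
        window_c = [x - window_mean for x in window]
        window_norm = math.sqrt(sum(x * x for x in window_c))
        if window_norm == 0:
            score = 0.0  # silent (constant) window
        else:
            dot = sum(a * b for a, b in zip(needle_c, window_c))
            score = dot / (needle_norm * window_norm)
        best = max(best, score)
    return best


# Synthetic demo data (assumed values, not a real recording).
SAMPLE_RATE = 8000
ping = [math.sin(2 * math.pi * 880 * i / SAMPLE_RATE) for i in range(50)]
with_ping = [0.0] * 100 + ping + [0.0] * 100
other_tone = [math.sin(2 * math.pi * 300 * i / SAMPLE_RATE) for i in range(200)]
silence = [0.0] * 200

score_with_ping = best_match_score(with_ping, ping)
score_other_tone = best_match_score(other_tone, ping)
score_silence = best_match_score(silence, ping)
```

Because the score is normalized, a threshold such as 0.8 would not depend on the playback volume, although any real value would still have to be found by experiment.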

I tried googling for a solution, but because I know next to nothing about sound processing, I could not find one. My idea for the infinite loop was to change the input setting in Windows to system sounds, and then record and save a WAV file every five seconds. As soon as a file is saved, the program tries (in a try/except block) to delete the previous file to save storage, and processes the current sound in the same loop.
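The save-then-delete-previous part of this logic can be sketched with the standard library alone. The sketch below assumes each recorded chunk is already available as a list of 16-bit integer samples; how the chunk is captured from the system audio is left out because it depends on the recording module and the sound device. The helper names write_wav and save_and_rotate are hypothetical.

```python
import os
import struct
import tempfile
import wave


def write_wav(path, samples, sample_rate=44100):
    """Write mono 16-bit PCM samples (ints in [-32768, 32767]) to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))


def save_and_rotate(folder, index, samples, sample_rate=44100):
    """Save chunk number `index`, then delete chunk `index - 1` if it exists."""
    current = os.path.join(folder, f"{index}.wav")
    write_wav(current, samples, sample_rate)
    previous = os.path.join(folder, f"{index - 1}.wav")
    try:
        os.remove(previous)
    except FileNotFoundError:
        pass  # first chunk: there is nothing to delete yet
    return current


# Demo with three silent chunks in a temporary folder (assumed data).
with tempfile.TemporaryDirectory() as folder:
    for index in range(1, 4):
        last_path = save_and_rotate(folder, index, [0] * 100)
    remaining = os.listdir(folder)
    with wave.open(last_path, "rb") as wav_file:
        frames_in_last = wav_file.getnframes()
```

Only the newest file is ever left on disk. If the recording module can hand over the samples directly, the files could be skipped entirely and each chunk processed in memory.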

Following is sample code I wrote with OpenCV for image detection. (NOTE: You don't necessarily need to look into this code to solve my problem; it is just to give an idea. I want similar code, but with audio instead of image detection.)

import cv2 as cv
import os
import pyautogui
import time
from PIL import ImageGrab
import keyboard

# path_1 is the path of the "template", i.e. the image that I am trying to detect on the screen
path_1 = r"C:\Users\Aditya\AppData\Roaming\.minecraft\screenshots\2022-12-21_11.37.40 - Copy.png"

threshold = 50672740  # this number was obtained experimentally

# path is the folder where the screenshots are saved
path = "C:\\Users\\Aditya\\Documents\\fishing_pics\\"


n = len(os.listdir(path))+1

while not keyboard.is_pressed("ctrl"):

    filename = path+str(n)+".png"
    prev2_filename = path+str(n-2)+".png"

    n+=1
    screenshot = ImageGrab.grab()
    screenshot.save(filename,"PNG")

    try:
        os.remove(prev2_filename)
    except OSError:
        print(prev2_filename + " doesn't exist")

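    # Flag 0 already loads both images as grayscale, so no cvtColor call is needed
    # (cvtColor with COLOR_BGR2GRAY raises an error on single-channel images)
    haystack_img = cv.imread(filename, 0)
    needle_img = cv.imread(path_1, 0)

    # matchTemplate expects the larger image first and the template second.
    # Assumption: the unnormalized TM_CCOEFF is used here, because TM_CCOEFF_NORMED
    # returns scores in [-1, 1] and could never reach the large threshold set above
    result = cv.matchTemplate(haystack_img, needle_img, cv.TM_CCOEFF)
    min_val, max_val, min_loc, max_loc = cv.minMaxLoc(result)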


    if max_val>=threshold:
        pyautogui.click(button='right')
        time.sleep(1)
        pyautogui.click(button='right')

    time.sleep(0.1)

# Remove any files left in the screenshot folder in order to save storage
for file_name in os.listdir(path):
    os.remove(os.path.join(path, file_name))

LOGIC >>> Ignore the variable "n"; it is only used to name the files 1.png, 2.png, etc. I capture and save a picture of my screen using ImageGrab from Pillow (PIL), compare it to the template image to get a match score, and if the score is greater than or equal to a threshold (set by me), the program performs an action (here, right-clicking).
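The same capture, score, compare, act structure does not depend on whether the input is an image or a sound. A minimal sketch, assuming the chunks arrive as an iterable and that scoring, acting, and stopping are passed in as functions (the name run_detection_loop and all of its parameters are hypothetical):

```python
def run_detection_loop(chunks, score_chunk, threshold, on_detect,
                       should_stop=lambda: False):
    """Score each captured chunk and call `on_detect` whenever the score
    reaches `threshold`. Returns the number of detections."""
    detections = 0
    for chunk in chunks:
        if should_stop():
            break  # e.g. the user pressed the stop key
        if score_chunk(chunk) >= threshold:
            on_detect()
            detections += 1
    return detections


# Demo with made-up chunks; the score is simply the peak absolute amplitude.
clicks = []
demo_chunks = [[0, 1, 0], [0, 9, 0], [2, 2, 2], [8, 0, 0]]
detections = run_detection_loop(
    demo_chunks,
    score_chunk=lambda chunk: max(abs(sample) for sample in chunk),
    threshold=5,
    on_detect=lambda: clicks.append("right-click"),
)
```

In the real program, on_detect would be something like lambda: pyautogui.click(button='right') and should_stop would be lambda: keyboard.is_pressed("ctrl"), as in the image version above.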
