I am using person-detection-action-recognition-0005 pre-trained model from openvino to detect the person and their action.
From the above link, I wrote a python script to get detections.
This is the script.
import cv2
def main():
print(cv2.__file__)
frame = cv2.imread('/home/naveen/Downloads/person.jpg')
actionNet = cv2.dnn.readNet('person-detection-action-recognition-0005.bin',
'person-detection-action-recognition-0005.xml')
actionBlob = cv2.dnn.blobFromImage(frame, size=(680, 400))
actionNet.setInput(actionBlob)
# detection output
actionOut = actionNet.forward(['mbox_loc1/out/conv/flat',
'mbox_main_conf/out/conv/flat/softmax/flat',
'out/anchor1','out/anchor2',
'out/anchor3','out/anchor4'])
# this is the part where I dont know how to get person bbox
# and action label for those person fro actionOut
for detection in actionOut[2].reshape(-1, 3):
print('sitting ' +str( detection[0]))
print('standing ' +str(detection[1]))
print('raising hand ' +str(detection[2]))
Now, I don't know how to get bbox and action label from the output variable(actionOut). I am unable to find any documentation or blog explaining this.
Does someone have any idea or suggestion, how it can be done?