I have a video and I'm running an action-recognition model on it. Suppose I take the top 5 predicted action labels for that video:
# Pass the input clip through the model
with torch.no_grad():
    prediction = model(video_input.to(device), input_type="video")
# Get the predicted classes
pred_classes = prediction.topk(k=5).indices
# Map the predicted classes to the label names
pred_class_names = [kinetics_id_to_classname[int(i)] for i in pred_classes[0]]
print("Top 5 predicted labels: %s" % ", ".join(pred_class_names))
The output looks like this:
[take plate, spread butter, throw edges, remove plastic, take spoon]
How can I apply a Hidden Markov Model (HMM) to refine these predictions and to predict the steps of the task?
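
To make the question concrete, here is a rough sketch of the kind of thing I have in mind, not a working solution. It assumes the video is split into consecutive clips, that the model's scores are turned into per-clip probabilities (e.g. with a softmax), and that those probabilities can stand in for HMM emission scores, which is an approximation. The step labels, the probabilities, and the transition matrix below are all made up for illustration; in practice the transitions would presumably be estimated from annotated step sequences. Viterbi decoding then picks the most likely sequence of steps:

```python
import numpy as np

def viterbi(log_emis, log_trans, log_start):
    """Most likely state sequence under an HMM (all inputs in log space).

    log_emis:  (T, S) score of each step label for each clip
    log_trans: (S, S) log P(next step | current step)
    log_start: (S,)   log P(first step)
    """
    T, S = log_emis.shape
    score = log_start + log_emis[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best path ending in step i, followed by transition i -> j
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emis[t]
    # Follow the back-pointers from the best final step
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical step labels and numbers, for illustration only
steps = ["take plate", "take spoon", "spread butter"]

# Assumed per-clip probabilities (rows: clips, columns: steps)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.1, 0.4],  # the raw argmax jumps back to "take plate" here
    [0.1, 0.2, 0.7],
])

# Assumed transitions: a step mostly repeats or moves on to the next one
trans = np.array([
    [0.50, 0.45, 0.05],
    [0.05, 0.50, 0.45],
    [0.05, 0.05, 0.90],
])
start = np.array([0.8, 0.1, 0.1])

raw = probs.argmax(axis=1).tolist()
decoded = viterbi(np.log(probs), np.log(trans), np.log(start))

print("Raw argmax:", [steps[i] for i in raw])
print("Viterbi:   ", [steps[i] for i in decoded])
```

With these made-up numbers, the raw per-clip argmax is take plate, take spoon, take plate, spread butter, while the decoded sequence replaces the third clip with spread butter because going back to "take plate" is unlikely under the transition matrix. Is this the right way to combine the model's top-k scores with an HMM?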