
I am using PyTorch for custom image classification with 2 classes (cats and dogs). I already have a pretrained model that classifies the image passed to it as either cat or dog.

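import torch
from PIL import Image

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
class_names = ['dogs', 'cats']

def predict_label(image_path, model):
    # `transformer` is the preprocessing pipeline, defined elsewhere (not shown).
    img = Image.open(image_path)
    img_transformed = transformer(img).to(device)
    # Add a batch dimension if the transform returns a single (C, H, W) tensor.
    if img_transformed.dim() == 3:
        img_transformed = img_transformed.unsqueeze(0)

    model.eval()
    with torch.no_grad():
        output = model(img_transformed)  # raw class scores, one row per image
        print(output)
        index = output.argmax(dim=1).item()
        return class_names[index]

model = torch.load('dogs_cats', map_location=device)
predict_label("./husky.jpg", model)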

I'm getting the following output for model(img_transformed):

tensor([[ 0.4717, -0.1059]], device='cuda:0')
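Note that the two numbers in this tensor are raw scores (one is negative), not probabilities. A minimal sketch of turning them into probabilities with a softmax is shown here in plain Python so it runs without a GPU; the helper name `softmax` is only illustrative, and in PyTorch the equivalent call would be `torch.softmax(output, dim=1)`.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

class_names = ['dogs', 'cats']
logits = [0.4717, -0.1059]  # values copied from the printed tensor

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(class_names[best], [round(p, 4) for p in probs])
```

With these scores the result is roughly 0.64 for dogs and 0.36 for cats, so the prediction is 'dogs', but the output still carries no information about where in the image the animal is.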

Is it possible to draw a bounding box around the part of the image that belongs to the class with the highest probability? If so, how do I get the coordinates of the bounding box?

Surya
    Unfortunately, what you are doing is image classification; the problem you want to solve is actually [object detection](https://machinelearningmastery.com/object-recognition-with-deep-learning). An image classifier only predicts the probability that the whole image belongs to each of N classes (the class with the highest probability is taken as the prediction). You cannot get any bounding boxes from it. – Mercury Nov 20 '22 at 04:58
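To illustrate the difference, here is a rough sketch of what reading coordinates from an object detector's output could look like. Everything in it is hypothetical: the `detection` values are made up, `label_names` and `best_box` are illustrative names, and plain lists stand in for tensors. The layout (boxes as `(x_min, y_min, x_max, y_max)` in pixels, alongside `labels` and `scores`) follows the format torchvision's detection models return.

```python
# Hypothetical detector output for one image (made-up numbers).
detection = {
    "boxes": [[34.0, 50.0, 410.0, 390.0], [5.0, 8.0, 60.0, 70.0]],
    "labels": [1, 2],
    "scores": [0.97, 0.31],
}
label_names = {1: "dogs", 2: "cats"}  # assumed label mapping

def best_box(det, min_score=0.5):
    """Return (label, score, box) for the top-scoring detection, or None."""
    candidates = [
        (score, label, box)
        for box, label, score in zip(det["boxes"], det["labels"], det["scores"])
        if score >= min_score
    ]
    if not candidates:
        return None
    score, label, box = max(candidates, key=lambda c: c[0])
    return label_names[label], score, box

result = best_box(detection)
print(result)
```

The returned box holds the four corner coordinates that a drawing routine would need; a classifier like the one in the question has no such field in its output.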

0 Answers