I trained a ResNet-50 on the CIFAR-10 classification task using AMP (mixed-precision training). When I ran inference on the test data, I found no reduction in inference time between fp16 and fp32. (I switched the output dtype between float16 and float32 using amp autocast.)
I wonder whether there is a performance difference when AMP is applied to other tasks, because I could not find any difference on CIFAR-10. If I wrote my code wrong, or if you have any opinions, please write below!
Thanks for reading.
import numpy as np
import torch

def inference_one_epoch(model, data_loader, device):
    model.eval()
    all_preds, all_labels = [], []
    with torch.no_grad():  # no gradients needed at inference time
        for imgs, target in data_loader:
            imgs = imgs.to(device)
            target = target.to(device)
            with torch.autocast(device_type='cuda', dtype=torch.float16):
                output = model(imgs)
            all_preds.append(torch.argmax(output, 1).cpu().numpy())
            all_labels.append(target.cpu().numpy())
    all_preds = np.concatenate(all_preds)
    all_labels = np.concatenate(all_labels)
    test_acc = (all_preds == all_labels).mean()
    print('Total Test accuracy = {:.4f}'.format(test_acc))
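For reference, here is a minimal sketch of how the fp32-vs-autocast timing comparison can be measured. The tiny stand-in model, batch shape, and iteration count are placeholders (not my actual setup); on CUDA you need `torch.cuda.synchronize()` around the timer because GPU kernels launch asynchronously, and the sketch falls back to CPU bfloat16 autocast when no GPU is available:

```python
import time
import torch
import torch.nn as nn

def timed_inference(model, imgs, device, dtype=None, n_iters=10):
    """Return mean seconds per forward pass; dtype=None means plain fp32."""
    model.eval()
    with torch.no_grad():
        model(imgs)  # warm-up pass so one-time setup cost is not timed
    if device == 'cuda':
        torch.cuda.synchronize()  # kernels are async; sync before timing
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(n_iters):
            if dtype is None:
                model(imgs)
            else:
                with torch.autocast(device_type=device, dtype=dtype):
                    model(imgs)
    if device == 'cuda':
        torch.cuda.synchronize()  # wait for all timed kernels to finish
    return (time.perf_counter() - start) / n_iters

# Toy model standing in for ResNet-50 (assumption: pass your real model here)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
).to(device)
imgs = torch.randn(64, 3, 32, 32, device=device)  # CIFAR-10-shaped batch

half = torch.float16 if device == 'cuda' else torch.bfloat16
t_fp32 = timed_inference(model, imgs, device)
t_amp = timed_inference(model, imgs, device, dtype=half)
print(f'fp32: {t_fp32 * 1000:.2f} ms/batch, amp: {t_amp * 1000:.2f} ms/batch')
```

Note that a small model on 32x32 inputs may show little or no autocast speedup, since the per-kernel work is tiny relative to launch overhead.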