I'm trying to extract text from an image using the Donut model, which is a document image parser. It seems the input image is not being passed in the format the model expects. I'm getting this error:
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
on this line:
output = model.inference(image=image, prompt="<s_cord-v2>")
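From what I can tell, the error means the pixel tensor going into the encoder is still float32 while the encoder weights were cast to bfloat16. A tiny standalone PyTorch snippet (nothing to do with Donut, just to show the kind of mismatch I mean) triggers the same kind of error:

import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3).to(torch.bfloat16)  # weights/bias in bfloat16
x = torch.randn(1, 3, 32, 32)                                   # input left in float32
conv(x)  # raises the same kind of RuntimeError about input vs. bias dtype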
Here is my entire code:
from donut import DonutModel
from PIL import Image
import torch
model = DonutModel.from_pretrained("naver-clova-ix/donut-base-finetuned-cord-v2")
if torch.cuda.is_available():
    model.half()
    device = torch.device("cuda")
    model.to(device)
else:
    model.encoder.to(torch.bfloat16)
model.eval()
image = Image.open("testfolder/test1.jpg").convert("RGB")
output = model.inference(image=image, prompt="<s_cord-v2>")
print(output)
I understand that the image is not in the right format, but how would I go about fixing that?
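One idea I had, though I'm not sure it's the intended approach, is to build the pixel tensor myself and cast it to bfloat16 before passing it in, so it matches the encoder on CPU. This assumes model.encoder.prepare_input() exists and that inference() accepts an image_tensors argument, which is just my reading of the donut source, so treat it as a sketch:

# Sketch only: prepare_input and image_tensors are my reading of the donut source
image = Image.open("testfolder/test1.jpg").convert("RGB")
pixels = model.encoder.prepare_input(image).unsqueeze(0)  # (1, C, H, W) float32 pixel tensor
if not torch.cuda.is_available():
    pixels = pixels.to(torch.bfloat16)  # cast to match the bfloat16 encoder on CPU
output = model.inference(image_tensors=pixels, prompt="<s_cord-v2>")

Or would it be simpler to just skip the model.encoder.to(torch.bfloat16) cast on CPU and keep everything in float32? Which of these is the right way to fix it?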