The code gets stuck at `pipe.to(device)`:

from auth_token import auth_token
from fastapi import FastAPI, Response
from fastapi.middleware.cors import CORSMiddleware
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline
from io import BytesIO
import base64

from torch.cuda import empty_cache

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_credentials=True,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

device = "cuda"
model_id = "CompVis/stable-diffusion-v1-4"
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=auth_token,
)
print(torch.cuda.get_device_properties(0).total_memory)
pipe.to(device)


@app.get("/")
def generate(prompt: str):
    with autocast(device):
        image = pipe(prompt, guidance_scale=8.5).images[0]

    image.save("testimage.png")
    empty_cache()

    return {"out": "hello World"}

I have installed the latest versions of transformers, diffusers, and scipy.

I have a GPU with 4 GB of VRAM.

  • What does your `total_memory` print statement print out? Could you also print the versions of the mentioned libraries? – doneforaiur Aug 10 '23 at 14:09
  • The `total_memory` is 4294639616. FastAPI: 0.101.0, Torch: 2.0.1, Diffusers: 0.19.3. – Nilesh Nath Aug 11 '23 at 00:34

1 Answer

4 GB of VRAM is quite low. Check this post for ways to reduce memory usage. To get it to work in under 4 GB, I'd suggest:

pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing(1)
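
Put together, a minimal sketch of that setup might look like this (assuming diffusers 0.19.x; the auth token and the FastAPI wiring from the question are left out, and the prompt is just a placeholder). Note that `enable_sequential_cpu_offload()` requires the `accelerate` package to be installed.

import torch
from diffusers import StableDiffusionPipeline

model_id = "CompVis/stable-diffusion-v1-4"

# Load the weights in half precision to roughly halve their footprint.
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    revision="fp16",
    torch_dtype=torch.float16,
)

# Keep the pipeline on the CPU and stream each submodule to the GPU only
# while it is running; slower, but peak VRAM usage stays very low.
pipe.enable_sequential_cpu_offload()

# Compute attention one slice at a time to cap peak activation memory.
pipe.enable_attention_slicing(1)

# With sequential CPU offload enabled, skip pipe.to("cuda"); the offload
# hooks place each submodule on the GPU on demand.
image = pipe("an astronaut riding a horse", guidance_scale=8.5).images[0]
image.save("testimage.png")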

Also, you should assign the result back to `pipe`, like this:

pipe = pipe.to(device)
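
As an aside, `pipe.to(device)` and `enable_sequential_cpu_offload()` are alternatives rather than complements: the diffusers documentation recommends not moving the pipeline to CUDA when sequential offloading is enabled, because the offload hooks handle device placement on demand.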
– doneforaiur