So the main goal is:
- loading a safetensors checkpoint file in Stable Diffusion
- secondary goal: safely loading the .yaml model config
Here's the google colab I was working with: https://colab.research.google.com/github/deforum-art/deforum-stable-diffusion/blob/main/Deforum_Stable_Diffusion.ipynb
github page = https://github.com/deforum/deforum-stable-diffusion
With regular Stable Diffusion I'm able to swap safetensors checkpoint files in for the .ckpt files, but I think the issue I'm having with Deforum is the .yaml file. I've heard you're supposed to safe_load() these, otherwise there's a possibility of them running arbitrary code, similar to the pickle issue in .ckpt files.
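For reference, a minimal sketch of what "safe loading" actually means, assuming PyYAML is installed: safe_load() is a Python function you call on the file's contents, not something that goes in the filename. It only constructs plain data types (dicts, lists, strings, numbers), so it can't build arbitrary Python objects the way the old default yaml.load(f) could.

```python
import yaml

# A short excerpt of the config, inline here so the example is self-contained;
# in practice you'd pass an open file handle instead.
yaml_text = """\
model:
  base_learning_rate: 1.0e-04
  target: ldm.models.diffusion.ddpm.LatentDiffusion
"""

config = yaml.safe_load(yaml_text)

# The "target" comes back as a plain string, not an instantiated class.
print(config["model"]["target"])
```

Loading from disk is the same idea: `with open("v1-inference.yaml") as f: config = yaml.safe_load(f)`.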
I tried renaming the file to v1-inference.yaml.safe_load(), but that just looks wrong and, sure enough, didn't work.
I tried just not giving it a path to a custom .yaml file, but then the error is basically "where is this custom file?"
I can find examples of how to do this in general, but no specific examples of how it works with Deforum.
Maybe I'm just freaking out over nothing. Here's the source of the v1-inference.yaml file: does this look suspicious, or like it would run arbitrary remote code in the background?
model:
  base_learning_rate: 1.0e-04
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false   # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    scheduler_config: # 10000 warmup steps
      target: ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 10000 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
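For context on what those target: lines actually do: the YAML itself is inert data, but the Stable Diffusion code (instantiate_from_config in ldm/util.py) resolves each target string to a Python class and constructs it with the given params. A rough, simplified sketch of that mechanism, demoed with a standard-library class as the target instead of an ldm class:

```python
import importlib


def get_obj_from_str(string: str):
    # Split "pkg.module.ClassName" into a module path and a class name,
    # import the module, and return the class object.
    module, cls = string.rsplit(".", 1)
    return getattr(importlib.import_module(module), cls)


def instantiate_from_config(config: dict):
    # Build the class named by "target", passing "params" as kwargs.
    return get_obj_from_str(config["target"])(**config.get("params", {}))


# Demo with a stdlib target instead of ldm.models.diffusion.ddpm.LatentDiffusion:
obj = instantiate_from_config({
    "target": "fractions.Fraction",
    "params": {"numerator": 3, "denominator": 4},
})
```

So the config can only *name* classes; it's the Python code that imports and instantiates them. The class paths in your file are all standard latent-diffusion and torch modules.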
Honestly, it seems harmless, but I don't know, maybe I'm missing something. Reading too much cybersecurity stuff makes you paranoid about everything. Thanks!