
So the main goal is:

  • Loading a safetensors checkpoint file in Stable Diffusion
  • Secondary goal: safe-loading the .yaml model config

Here's the google colab I was working with: https://colab.research.google.com/github/deforum-art/deforum-stable-diffusion/blob/main/Deforum_Stable_Diffusion.ipynb

github page = https://github.com/deforum/deforum-stable-diffusion

With regular Stable Diffusion I'm able to swap in safetensors checkpoint files in place of the ckpt files. With Deforum, though, I think the issue I'm having is the .yaml file. I've heard that you're supposed to load these with yaml.safe_load(), because loading YAML with the default loader can run arbitrary code, similar to the pickle issue in ckpt files.

I tried renaming the file to v1-inference.yaml.safe_load(), but that just looks wrong and, unsurprisingly, didn't work (as far as I can tell, safe_load() is a Python function you call on the file's contents, not something that goes in the filename).

I also tried just not giving a path to a custom .yaml file, but then I get an error along the lines of "where is this custom file?"

I can find examples of how to do this in general, but no specific examples of how it works with Deforum.

Maybe I'm just freaking out over nothing. Here are the contents of the v1-inference.yaml file — does this look suspicious, or like it could run arbitrary remote code in the background?

model:
  base_learning_rate: 1.0e-04
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false   # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    scheduler_config: # 10000 warmup steps
      target: ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 10000 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder

Honestly, it seems harmless, but I dunno, maybe I'm missing something. Reading too much cybersecurity stuff makes you paranoid about everything. Thanks!
