I have collected a set of images and a separate text file with a caption for each image. Each text file has the same name as its image file, but with a different extension (txt). I then tried to train using the default training instructions:
export MODEL_NAME="stabilityai/stable-diffusion-2-1"
export INSTANCE_DIR="/path/to/images"
export OUTPUT_DIR="/output/path/"
accelerate launch train_dreambooth.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--instance_data_dir=$INSTANCE_DIR \
--output_dir=$OUTPUT_DIR \
--instance_prompt="a painting by [s]" \
--resolution=768 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 --gradient_checkpointing \
--use_8bit_adam \
--enable_xformers_memory_efficient_attention \
--set_grads_to_none \
--learning_rate=2e-6 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--max_train_steps=800
But the sample script treats the text files as images and raises an error saying that they are not images:
UnidentifiedImageError: cannot identify image file
'/path/to/images/00171-0-20181124_131525.txt'
How do I caption the images properly so that the training script uses the text files as captions?
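For reference, here is a minimal sketch of a workaround for the error itself, assuming the script tries to open every file in the instance directory as an image. It moves the caption files into a separate folder and keeps a mapping from image name to caption. The function name `split_captions`, the `caption_dir` argument, and the list of image extensions are my own placeholders, not part of `train_dreambooth.py`. This only stops the `UnidentifiedImageError`; it does not make the script read the captions.

```python
import shutil
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def split_captions(instance_dir, caption_dir):
    """Move same-named .txt captions out of instance_dir.

    Returns a dict mapping each image file name to its caption text.
    Images without a caption file are left alone and not listed.
    """
    instance_dir = Path(instance_dir)
    caption_dir = Path(caption_dir)
    caption_dir.mkdir(parents=True, exist_ok=True)

    captions = {}
    # sorted() materializes the listing before any file is moved.
    for image in sorted(instance_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue
        txt = image.with_suffix(".txt")
        if txt.is_file():
            captions[image.name] = txt.read_text(encoding="utf-8").strip()
            # After the move, only image files remain in instance_dir.
            shutil.move(str(txt), str(caption_dir / txt.name))
    return captions
```

Calling `split_captions("/path/to/images", "/path/to/captions")` (the second path is hypothetical) before launching training should leave only images in `INSTANCE_DIR`, but every image would still be trained with the single `--instance_prompt`.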