Troubleshooting StyleGAN3: model collapse during training and warnings during evaluation despite using the --workers=2 flag

Question

I am attempting to train StyleGAN3 on Google Cloud using the following setup:

2 vCPUs
13 GB of RAM
NVIDIA T4
PyTorch 1.13
CUDA 11.3
Python 3.10

My dataset consists of approximately 14,000 JPG images of dress sketches, which I converted to PNG format using the following Pillow script.

[Image: dataset sample]

import os
from PIL import Image

def convert_images(input_folder, output_folder):
    """Convert every JPEG in input_folder to a PNG in output_folder."""
    os.makedirs(output_folder, exist_ok=True)

    for filename in sorted(os.listdir(input_folder)):
        # Match .jpg and .jpeg regardless of case
        if not filename.lower().endswith((".jpg", ".jpeg")):
            continue

        image_path = os.path.join(input_folder, filename)
        new_filename = os.path.splitext(filename)[0] + ".png"
        output_path = os.path.join(output_folder, new_filename)

        # Open the image and close the file handle once it is saved
        with Image.open(image_path) as image:
            image.save(output_path, "PNG")

        print(f"Converted {filename} to {new_filename}")

    print("Conversion completed.")


input_path = "/home/..../img"
output_path = "/home/..../converted_img"
convert_images(input_path, output_path)
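
Before building the dataset, it may be worth checking whether the converted PNGs all share the same size and color type, since mixed grayscale/RGB files or non-square sketches could affect what the model sees after resizing. The sketch below reads only the fixed-layout PNG header (signature plus IHDR chunk) with the standard library, so it does not decode any image data. The helper names read_png_header and summarize_pngs are hypothetical and not part of Pillow or StyleGAN3.

```python
import os
import struct
from collections import Counter

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color type codes defined by the PNG specification (IHDR chunk)
COLOR_TYPES = {
    0: "grayscale",
    2: "rgb",
    3: "palette",
    4: "grayscale+alpha",
    6: "rgba",
}


def read_png_header(path):
    """Return (width, height, color type name) from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(26)  # 8-byte signature + chunk length/type + first 10 IHDR bytes
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return width, height, COLOR_TYPES.get(color_type, f"unknown({color_type})")


def summarize_pngs(folder):
    """Count how many PNGs in folder share each (width, height, color type)."""
    counts = Counter()
    for filename in sorted(os.listdir(folder)):
        if filename.lower().endswith(".png"):
            counts[read_png_header(os.path.join(folder, filename))] += 1
    return counts


# Example usage (point this at the converted image folder):
# for (width, height, color), n in summarize_pngs(output_path).most_common():
#     print(f"{n:6d} files: {width}x{height} {color}")
```

If the summary shows more than one color type or many different aspect ratios, that would be useful information to include alongside the training results.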

I built the dataset using the following command: python dataset_tool.py --source /home/..../converted_img --dest /home/..../dataset.zip --resolution=256x256

The training process starts with the following command: python train.py --data=/home/..../dataset.zip --outdir=/home/..../training-runs --cfg=stylegan3-t --gpus=1 --batch=32 --gamma=2 --batch-gpu=16 --snap=10 --mirror=1 --workers=2
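
To make the flags in this command concrete, here is a small sketch of the arithmetic they imply. The helper name training_schedule is hypothetical, and the default of 4 kimg per tick is an assumption based on the default of train.py's --tick option; the iteration count is approximate because a tick ends once the image counter passes the threshold.

```python
def training_schedule(batch, batch_gpu, gpus, snap, kimg_per_tick=4):
    """Derive gradient accumulation and snapshot spacing from train.py flags.

    kimg_per_tick=4 is an assumed default (the --tick option), not read
    from the actual run.
    """
    if batch % (batch_gpu * gpus) != 0:
        raise ValueError("--batch must be a multiple of --gpus times --batch-gpu")

    # Each optimizer step is split into this many per-GPU sub-batches
    accumulation_rounds = batch // (batch_gpu * gpus)

    # --snap is measured in ticks, and a tick is measured in thousands of images
    kimg_per_snapshot = snap * kimg_per_tick
    iterations_per_snapshot = kimg_per_snapshot * 1000 // batch

    return {
        "accumulation_rounds": accumulation_rounds,
        "kimg_per_snapshot": kimg_per_snapshot,
        "iterations_per_snapshot": iterations_per_snapshot,
    }


# Values taken from the training command above
schedule = training_schedule(batch=32, batch_gpu=16, gpus=1, snap=10)
print(schedule)
```

Under these assumptions, the run accumulates gradients over 2 rounds of 16 images per step, and a snapshot with its metric evaluation is due roughly every 40 kimg (about 1,250 training iterations).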

Initially, everything seems to work fine, but training stops earlier than I expected and the evaluation phase begins. Evaluation then appears to hang: I waited about an hour with no progress, probably because of a warning message.

[Image: warning message displayed]

I also tried --workers=1, but the metrics still seem to freeze, so I had to disable them.

Additionally, I would like to share the training result I obtained.

[Image: collapsed generated fakes]

The model has clearly collapsed.

Could you please provide guidance on how to best present the training result for further analysis and troubleshooting?
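
One possible way to present the run, beyond the image grid, is to tabulate the statistics logged during training. The sketch below assumes the run directory contains a stats.jsonl file with one JSON object per tick, where each statistic maps to a dictionary with a "mean" field, and that keys such as "Progress/kimg", "Loss/G/loss", "Loss/D/loss", "Loss/scores/real" and "Loss/scores/fake" are present. Those file and key names are assumptions about the StyleGAN3 output and should be checked against the actual file; load_stats and format_rows are hypothetical helper names.

```python
import json

# Assumed statistic names; adjust to whatever the stats file actually contains
DEFAULT_KEYS = [
    "Progress/kimg",
    "Loss/G/loss",
    "Loss/D/loss",
    "Loss/scores/real",
    "Loss/scores/fake",
]


def load_stats(stats_path, keys=DEFAULT_KEYS):
    """Read a JSON-lines stats file and keep the mean of each requested key."""
    rows = []
    with open(stats_path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            row = {}
            for key in keys:
                entry = record.get(key)
                # Missing or unexpected entries are recorded as None
                row[key] = entry.get("mean") if isinstance(entry, dict) else None
            rows.append(row)
    return rows


def format_rows(rows, keys=DEFAULT_KEYS):
    """Render the rows as a tab-separated table that can be pasted into a post."""
    lines = ["\t".join(keys)]
    for row in rows:
        cells = ("n/a" if row[key] is None else f"{row[key]:.3f}" for key in keys)
        lines.append("\t".join(cells))
    return "\n".join(lines)


# Example usage (the run directory name is a placeholder):
# print(format_rows(load_stats("/home/..../training-runs/<run-dir>/stats.jsonl")))
```

A table like this, together with the exact command line and a few generated image grids from successive snapshots, would show whether the discriminator scores for real and fake images diverged before the collapse.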

Thank you for your assistance.
