Sample Train Config

from pathlib import Path

from mu.core.base_config import BaseConfig  # import path assumed; adjust to your project layout

current_dir = Path(__file__).parent


class EraseDiffConfig(BaseConfig):

    def __init__(self, **kwargs):
        self.train_method = "xattn"
        self.alpha = 0.1
        self.epochs = 1
        self.K_steps = 2
        self.lr = 5e-5
        self.model_config_path = current_dir / "model_config.yaml"
        self.ckpt_path = "models/compvis/style50/compvis.ckpt"
        self.raw_dataset_dir = "data/quick-canvas-dataset/sample"
        self.processed_dataset_dir = "mu/algorithms/erase_diff/data"
        self.dataset_type = "unlearncanvas"
        self.template = "style"
        self.template_name = "Abstractionism"
        self.output_dir = "outputs/erase_diff/finetuned_models"
        self.separator = None
        self.image_size = 512
        self.interpolation = "bicubic"
        self.ddim_steps = 50
        self.ddim_eta = 0.0
        self.devices = "0"
        self.use_sample = True
        self.num_workers = 4
        self.pin_memory = True

        # Let callers override any default above via keyword arguments.
        for key, value in kwargs.items():
            setattr(self, key, value)

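A minimal usage sketch (values are illustrative; any field above can be overridden the same way):

config = EraseDiffConfig(
    lr=1e-5,
    epochs=2,
    devices="0,1",
)
print(config.lr, config.template_name)  # 1e-05 Abstractionism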

Sample Model Config

model:

  base_learning_rate: 1.0e-04
  target: stable_diffusion.ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "edited"
    cond_stage_key: "edit"
    image_size: 64
    channels: 4
    cond_stage_trainable: false   # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    scheduler_config: # 10000 warmup steps
      target: stable_diffusion.ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 10000 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    unet_config:
      target: stable_diffusion.ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: stable_diffusion.ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: stable_diffusion.ldm.modules.encoders.modules.FrozenCLIPEmbedder

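The config above is consumed with the standard LDM instantiation pattern: each target is a dotted import path, and params are passed to its constructor. A minimal sketch of that pattern (the helper mirrors ldm's instantiate_from_config; treat it as illustrative rather than this repository's exact API):

import importlib

from omegaconf import OmegaConf


def instantiate_from_config(config):
    # Resolve the dotted "target" path, then call it with "params" as kwargs.
    module_path, cls_name = config["target"].rsplit(".", 1)
    cls = getattr(importlib.import_module(module_path), cls_name)
    return cls(**config.get("params", dict()))


cfg = OmegaConf.load("model_config.yaml")
model = instantiate_from_config(cfg.model)  # builds the LatentDiffusion model
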
Description of Arguments in train_config.yaml

Training Parameters

  • train_method: Specifies which subset of model parameters is fine-tuned for concept erasure (see the parameter-selection sketch after this list).

    • Choices: ["noxattn", "selfattn", "xattn", "full", "notime", "xlayer", "selflayer"]
    • Example: "xattn"
  • alpha: Guidance strength for the starting image during training.

    • Type: float
    • Example: 0.1
  • epochs: Number of epochs to train the model.

    • Type: int
    • Example: 1
  • K_steps: Number of inner-loop optimization steps (K) performed during training.

    • Type: int
    • Example: 2
  • lr: Learning rate used for the optimizer during training.

    • Type: float
    • Example: 5e-5
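
To make the train_method choices concrete, the sketch below filters UNet parameters by name, as is common in ESD-style erasure code. The name patterns ("attn2", "time_embed", ...) are assumptions about the CompVis UNet and may not match this repository's exact rules; "xlayer" and "selflayer" are omitted for brevity:

def select_trainable_params(model, train_method):
    # Name patterns are illustrative; the repository's selector may differ.
    params = []
    for name, param in model.model.diffusion_model.named_parameters():
        if train_method == "full":
            params.append(param)                          # every UNet parameter
        elif train_method == "xattn" and "attn2" in name:
            params.append(param)                          # cross-attention only
        elif train_method == "selfattn" and "attn1" in name:
            params.append(param)                          # self-attention only
        elif train_method == "noxattn" and "attn2" not in name:
            params.append(param)                          # all but cross-attention
        elif train_method == "notime" and "time_embed" not in name:
            params.append(param)                          # all but time embeddings
    return params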

Model Configuration

  • model_config_path: File path to the Stable Diffusion model configuration YAML file.

    • Type: str
    • Example: "/path/to/model_config.yaml"
  • ckpt_path: File path to the Stable Diffusion model checkpoint (see the loading sketch after this list).

    • Type: str
    • Example: "/path/to/model_checkpoint.ckpt"

Dataset Directories

  • raw_dataset_dir: Directory containing the raw dataset categorized by themes or classes.

    • Type: str
    • Example: "/path/to/raw_dataset"
  • processed_dataset_dir: Directory to save the processed dataset.

    • Type: str
    • Example: "/path/to/processed_dataset"
  • dataset_type: Specifies the dataset type for the training process. Use "generic" to train on your own dataset.

    • Choices: ["unlearncanvas", "i2p", "generic"]
    • Example: "unlearncanvas"
  • template: Type of template to use during training.

    • Choices: ["object", "style", "i2p"]
    • Example: "style"
  • template_name: Name of the specific concept or style to be erased (see the prompt sketch after this list).

    • Choices: ["self-harm", "Abstractionism"]
    • Example: "Abstractionism"

Output Configurations

  • output_dir: Directory where the fine-tuned models and results will be saved.

    • Type: str
    • Example: "outputs/erase_diff/finetuned_models"
  • separator: Separator string used to split the target prompt into multiple words that are erased separately, if applicable (see the sketch after this list).

    • Type: str or null
    • Example: null
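
A hedged sketch of how such a separator is commonly applied (an assumption about this trainer's behavior, with illustrative concept names):

prompt = "Abstractionism, Byzantine"
separator = ","

# With a separator, erase each word on its own; otherwise treat the prompt as one concept.
words = [w.strip() for w in prompt.split(separator)] if separator else [prompt]
print(words)  # ['Abstractionism', 'Byzantine']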

Sampling and Image Configurations

  • image_size: Size of the training images (height and width in pixels).

    • Type: int
    • Example: 512
  • interpolation: Interpolation method used for image resizing.

    • Choices: ["bilinear", "bicubic", "lanczos"]
    • Example: "bicubic"
  • ddim_steps: Number of DDIM inference steps during training.

    • Type: int
    • Example: 50
  • ddim_eta: DDIM eta parameter controlling the stochasticity of sampling (see the sampler sketch after this list).

    • Type: float
    • Example: 0.0
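
ddim_steps and ddim_eta map directly onto the DDIM sampler shipped with Stable Diffusion codebases. A minimal sketch, assuming the model built in the sections above and an import path that mirrors the target paths in the model config:

from stable_diffusion.ldm.models.diffusion.ddim import DDIMSampler

sampler = DDIMSampler(model)
cond = model.get_learned_conditioning(["An image in Abstractionism style"])

samples, _ = sampler.sample(
    S=50,                 # ddim_steps
    conditioning=cond,
    batch_size=1,
    shape=(4, 64, 64),    # latent (channels, height, width) for 512px images
    eta=0.0,              # ddim_eta: 0.0 gives deterministic DDIM
)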

Device Configuration

  • devices: Specifies the CUDA devices to be used for training, as a comma-separated string (see the parsing sketch below).

    • Type: str
    • Example: "0"

Additional Flags

  • use_sample: Flag to indicate whether a sample dataset should be used for training.

    • Type: bool
    • Example: True
  • num_workers: Number of worker processes for data loading.

    • Type: int
    • Example: 4
  • pin_memory: Flag to enable pinned memory during data loading for faster host-to-GPU transfers (see the DataLoader sketch after this list).

    • Type: bool
    • Example: True
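
num_workers and pin_memory are passed straight to a PyTorch DataLoader. A minimal, self-contained sketch with a placeholder dataset (the real one is built from processed_dataset_dir):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for the processed erase_diff dataset.
dataset = TensorDataset(torch.randn(8, 3, 512, 512))

loader = DataLoader(
    dataset,
    batch_size=2,      # illustrative; batch size is not part of this config
    num_workers=4,     # num_workers
    pin_memory=True,   # pin_memory: speeds up host-to-GPU copies
    shuffle=True,
)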