Sample Train Config
```python
from pathlib import Path

# BaseConfig is the repository's shared config base class (its import is
# omitted here; it ships with the codebase).
current_dir = Path(__file__).parent

class EraseDiffConfig(BaseConfig):
    def __init__(self, **kwargs):
        # Training parameters
        self.train_method = "xattn"
        self.alpha = 0.1
        self.epochs = 1
        self.K_steps = 2
        self.lr = 5e-5
        # Model configuration
        self.model_config_path = current_dir / "model_config.yaml"
        self.ckpt_path = "models/compvis/style50/compvis.ckpt"
        # Dataset directories
        self.raw_dataset_dir = "data/quick-canvas-dataset/sample"
        self.processed_dataset_dir = "mu/algorithms/erase_diff/data"
        self.dataset_type = "unlearncanvas"
        self.template = "style"
        self.template_name = "Abstractionism"
        # Output configuration
        self.output_dir = "outputs/erase_diff/finetuned_models"
        self.separator = None
        # Sampling and image configuration
        self.image_size = 512
        self.interpolation = "bicubic"
        self.ddim_steps = 50
        self.ddim_eta = 0.0
        # Device and data-loading configuration
        self.devices = "0"
        self.use_sample = True
        self.num_workers = 4
        self.pin_memory = True
        # Apply keyword overrides (assumed BaseConfig convention).
        for key, value in kwargs.items():
            setattr(self, key, value)
```
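Any of these defaults can be overridden at construction time through `**kwargs`. A minimal usage sketch (the import path is hypothetical, and the override loop shown above is an assumption about how `BaseConfig` subclasses consume keyword arguments):

```python
from mu.algorithms.erase_diff.configs import EraseDiffConfig  # hypothetical import path

# Start from the defaults, overriding only what differs for this run.
config = EraseDiffConfig(
    template_name="Abstractionism",
    epochs=2,
    lr=1e-5,
)
print(config.train_method)  # "xattn" (unchanged default)
print(config.epochs)        # 2 (overridden)
```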
Sample Model Config
```yaml
model:
  base_learning_rate: 1.0e-04
  target: stable_diffusion.ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "edited"
    cond_stage_key: "edit"
    image_size: 64
    channels: 4
    cond_stage_trainable: false # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    scheduler_config: # 10000 warmup steps
      target: stable_diffusion.ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 10000 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    unet_config:
      target: stable_diffusion.ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: stable_diffusion.ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: stable_diffusion.ldm.modules.encoders.modules.FrozenCLIPEmbedder
```
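In the latent-diffusion convention, this file is loaded with OmegaConf and each `target`/`params` pair is instantiated recursively. A minimal sketch, assuming the repository keeps latent-diffusion's `instantiate_from_config` helper (the module path is an assumption):

```python
from omegaconf import OmegaConf

# Assumed helper, mirroring ldm.util.instantiate_from_config from latent-diffusion.
from stable_diffusion.ldm.util import instantiate_from_config

config = OmegaConf.load("mu/algorithms/erase_diff/model_config.yaml")

# Builds the LatentDiffusion model named by `model.target`, passing
# `model.params` (which in turn instantiate the UNet, VAE, and CLIP encoder).
model = instantiate_from_config(config.model)
```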
Description of Arguments in train_config.yaml
Training Parameters
- train_method: Specifies which subset of model weights is fine-tuned for concept erasure (see the parameter-selection sketch after this list).
  - Choices: ["noxattn", "selfattn", "xattn", "full", "notime", "xlayer", "selflayer"]
  - Example: "xattn"
- alpha: Guidance strength for the starting image during training.
  - Type: float
  - Example: 0.1
- epochs: Number of epochs to train the model.
  - Type: int
  - Example: 1
- K_steps: Number of inner optimization steps (K) performed during training.
  - Type: int
  - Example: 2
- lr: Learning rate used by the optimizer during training.
  - Type: float
  - Example: 5e-5
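The `train_method` choices control which UNet parameters receive gradients. The sketch below illustrates the parameter-name convention used by ESD-style erasure code ("attn2" marks cross-attention, "attn1" self-attention); it is an assumption about the selection logic, not the repository's verbatim implementation:

```python
def select_trainable_parameters(model, train_method: str):
    """Yield the UNet parameters to fine-tune for a given train_method.

    "xlayer"/"selflayer" (which target specific transformer layers)
    are omitted for brevity.
    """
    for name, param in model.model.diffusion_model.named_parameters():
        if train_method == "full":
            yield name, param  # every UNet parameter
        elif train_method == "xattn" and "attn2" in name:
            yield name, param  # cross-attention blocks only
        elif train_method == "selfattn" and "attn1" in name:
            yield name, param  # self-attention blocks only
        elif train_method == "noxattn" and not (
            name.startswith("out.") or "attn2" in name or "time_embed" in name
        ):
            yield name, param  # all but cross-attention and time embedding
        elif train_method == "notime" and not (
            name.startswith("out.") or "time_embed" in name
        ):
            yield name, param  # all but the time embedding
```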
Model Configuration
- model_config_path: File path to the Stable Diffusion model configuration YAML file.
  - Type: str
  - Example: "/path/to/model_config.yaml"
- ckpt_path: File path to the Stable Diffusion model checkpoint.
  - Type: str
  - Example: "/path/to/model_checkpoint.ckpt"
Dataset Directories
- raw_dataset_dir: Directory containing the raw dataset, organized by themes or classes.
  - Type: str
  - Example: "/path/to/raw_dataset"
- processed_dataset_dir: Directory where the processed dataset is saved.
  - Type: str
  - Example: "/path/to/processed_dataset"
- dataset_type: Specifies the dataset type for the training process. Use "generic" if you want to use your own dataset.
  - Choices: ["unlearncanvas", "i2p", "generic"]
  - Example: "unlearncanvas"
- template: Type of template to use during training.
  - Choices: ["object", "style", "i2p"]
  - Example: "style"
- template_name: Name of the specific concept or style to be erased.
  - Choices: ["self-harm", "Abstractionism"]
  - Example: "Abstractionism"
Output Configurations
- output_dir: Directory where the fine-tuned models and results are saved.
  - Type: str
  - Example: "outputs/erase_diff/finetuned_models"
- separator: String separator used to train multiple words separately, if applicable (see the splitting sketch after this list).
  - Type: str or null
  - Example: null
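When `separator` is set, `template_name` may hold several concepts to be erased one at a time; a hedged sketch of that splitting behaviour (an assumption about how the trainer consumes the field):

```python
def split_concepts(template_name: str, separator):
    """Split a multi-concept string into individual erasure targets."""
    if separator is None:
        return [template_name]
    return [word.strip() for word in template_name.split(separator)]

split_concepts("Abstractionism", None)        # ["Abstractionism"]
split_concepts("Abstractionism,Cubism", ",")  # ["Abstractionism", "Cubism"]
```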
Sampling and Image Configurations
- image_size: Size of the training images (height and width, in pixels).
  - Type: int
  - Example: 512
- interpolation: Interpolation method used for image resizing (see the resizing sketch after this list).
  - Choices: ["bilinear", "bicubic", "lanczos"]
  - Example: "bicubic"
- ddim_steps: Number of DDIM sampling steps used during training.
  - Type: int
  - Example: 50
- ddim_eta: DDIM eta parameter controlling stochasticity during sampling (0.0 is fully deterministic).
  - Type: float
  - Example: 0.0
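The `image_size` and `interpolation` settings govern how training images are resized; a minimal sketch of that preprocessing step using PIL (the mapping dict is illustrative, not the repository's code):

```python
from PIL import Image

# Map config strings to PIL resampling filters (illustrative).
INTERPOLATIONS = {
    "bilinear": Image.BILINEAR,
    "bicubic": Image.BICUBIC,
    "lanczos": Image.LANCZOS,
}

def load_image(path: str, image_size: int = 512, interpolation: str = "bicubic"):
    """Open an image and resize it to image_size x image_size."""
    image = Image.open(path).convert("RGB")
    return image.resize((image_size, image_size), INTERPOLATIONS[interpolation])
```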
Device Configuration
- devices: Specifies the CUDA device indices to be used for training (comma-separated; see the parsing sketch after this list).
  - Type: str
  - Example: "0"
Additional Flags
- use_sample: Flag to indicate whether the sample dataset should be used for training.
  - Type: bool
  - Example: True
- num_workers: Number of worker processes for data loading.
  - Type: int
  - Example: 4
- pin_memory: Flag to enable pinned memory during data loading for faster host-to-GPU transfers (see the DataLoader sketch after this list).
  - Type: bool
  - Example: True
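These flags map directly onto `torch.utils.data.DataLoader` arguments; a minimal sketch (the toy dataset stands in for whatever the preprocessing pipeline produces, and the batch size is illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for the processed unlearning dataset.
dataset = TensorDataset(torch.randn(16, 3, 512, 512))

loader = DataLoader(
    dataset,
    batch_size=4,     # illustrative; batch size is not part of this config
    shuffle=True,
    num_workers=4,    # num_workers: parallel data-loading worker processes
    pin_memory=True,  # pin_memory: page-locked host memory for faster GPU transfers
)
```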