Sample config for AdvUnlearn (mu_defense)

#mu_defense/algorithms/adv_unlearn/configs/adv_unlearn_config.py

import os
from pathlib import Path
from mu_defense.core.base_config import BaseConfig


class AdvUnlearnConfig(BaseConfig):
    def __init__(self, **kwargs):
        # Inference & Model Paths
        self.model_config_path = "configs/stable-diffusion/v1-inference.yaml" #for compvis
        self.compvis_ckpt_path = "models/sd-v1-4-full-ema.ckpt"
        self.encoder_model_name_or_path = "CompVis/stable-diffusion-v1-4"
        self.cache_path = ".cache"

        self.diffusers_model_name_or_path = ""
        self.target_ckpt = None  # Optionally load a target checkpoint into the model for diffusers sampling

        # Devices & IO
        self.devices = "0,0"  # You can later parse this string into a list if needed.
        self.seperator = None
        self.output_dir = "outputs/adv_unlearn"

        # Image & Diffusion Sampling
        self.image_size = 512
        self.ddim_steps = 50
        self.start_guidance = 3.0
        self.negative_guidance = 1.0

        # Training Setup
        self.prompt = "nudity"
        self.dataset_retain = "coco_object"  # Choices: 'coco_object', 'coco_object_no_filter', 'imagenet243', 'imagenet243_no_filter'
        self.retain_batch = 5
        self.retain_train = "iter"  # Options: 'iter' or 'reg'
        self.retain_step = 1
        self.retain_loss_w = 1.0
        self.ddim_eta = 0

        self.train_method = "text_encoder_full"   # choices: 'text_encoder_full', 'text_encoder_layer0', 'text_encoder_layer01', 'text_encoder_layer012', 'text_encoder_layer0123', 'text_encoder_layer01234', 'text_encoder_layer012345', 'text_encoder_layer0123456', 'text_encoder_layer01234567', 'text_encoder_layer012345678', 'text_encoder_layer0123456789', 'text_encoder_layer012345678910', 'text_encoder_layer01234567891011', 'text_encoder_layer0_11', 'text_encoder_layer01_1011', 'text_encoder_layer012_91011', 'noxattn', 'selfattn', 'xattn', 'full', 'notime', 'xlayer', 'selflayer'
        self.norm_layer = False  # This is a flag; use True if you wish to update the norm layer.
        self.attack_method = "pgd"  # Choices: 'pgd', 'multi_pgd', 'fast_at', 'free_at'
        self.component = "all"     # Choices: 'all', 'ffn', 'attn'
        self.iterations = 10
        self.save_interval = 200
        self.lr = 1e-5

        # Adversarial Attack Hyperparameters
        self.adv_prompt_num = 1
        self.attack_embd_type = "word_embd"  # Choices: 'word_embd', 'condition_embd'
        self.attack_type = "prefix_k"         # Choices: 'replace_k', 'add', 'prefix_k', 'suffix_k', 'mid_k', 'insert_k', 'per_k_words'
        self.attack_init = "latest"           # Choices: 'random', 'latest'
        self.attack_step = 30
        self.attack_init_embd = None
        self.adv_prompt_update_step = 1
        self.attack_lr = 1e-3
        self.warmup_iter = 200

        #backend
        self.backend = "compvis"

        # Override default values with any provided keyword arguments.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def validate_config(self):
        """
        Perform basic validation on the config parameters.
        """
        if self.retain_batch <= 0:
            raise ValueError("retain_batch should be a positive integer.")
        if self.lr <= 0:
            raise ValueError("Learning rate (lr) should be positive.")
        if self.image_size <= 0:
            raise ValueError("Image size should be a positive integer.")
        if self.iterations <= 0:
            raise ValueError("Iterations must be a positive integer.")
        # Create the output directory if it does not already exist.
        if not os.path.exists(self.output_dir):
            os.makedirs(self.output_dir)

adv_unlearn_config = AdvUnlearnConfig()
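
Any field above can be overridden at construction time through keyword arguments (see the kwargs loop in __init__), and validate_config() runs the basic sanity checks and creates the output directory. A minimal usage sketch; the override values are illustrative, not recommendations:

# Override a few defaults at construction time and validate the result.
config = AdvUnlearnConfig(
    prompt="nudity",
    train_method="text_encoder_full",
    attack_method="pgd",
    iterations=10,
    lr=1e-5,
    output_dir="outputs/adv_unlearn",
)
config.validate_config()  # raises ValueError on invalid values and creates output_dir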

Description of fields in config file

Below is a detailed description of the configuration fields available in the adv_unlearn_config.py file. The descriptions match those provided in the help section of the command-line arguments.

  1. Inference & Model Paths

  2. model_config_path
    Description: Config path for the Stable Diffusion model (CompVis backend only).
    Type: str
    Example: configs/stable-diffusion/v1-inference.yaml

  3. compvis_ckpt_path
    Description: Checkpoint path for stable diffusion v1-4.
    Type: str
    Example: models/sd-v1-4-full-ema.ckpt

  4. encoder_model_name_or_path
    Description: Model name or path for the encoder.
    Type: str
    Example: CompVis/stable-diffusion-v1-4

  5. cache_path
    Description: Directory used for caching model files.
    Type: str
    Example: .cache

  6. diffusers_model_name_or_path
    Description: Model name or path for the diffusers (if used).
    Type: str
    Example: outputs/forget_me_not/finetuned_models/Abstractionism

  7. target_ckpt
    Description: Optionally load a target checkpoint into the model for diffuser sampling.
    Type: Typically str or None
    Example: path to a target checkpoint

  8. Devices & IO

  9. devices
    Description: CUDA devices to train on, given as a comma-separated string of device indices (a parsing sketch follows this group).
    Type: str
    Example: 0,0

  10. seperator
    Description: Separator used when training multiple words separately.
    Type: str or None
    Example: None

  11. output_dir
    Description: Directory where output files (e.g., checkpoints, logs) are saved.
    Type: str
    Example: outputs/adv_unlearn
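
The devices field is kept as a plain string; as the comment in the config notes, it can be parsed into a list when needed. A minimal parsing sketch, assuming PyTorch is available and the indices refer to CUDA devices (the helper name is illustrative):

import torch

def parse_devices(devices: str) -> list:
    """Turn a string such as "0,0" into a list of torch.device objects."""
    indices = [int(idx.strip()) for idx in devices.split(",")]
    if torch.cuda.is_available():
        return [torch.device(f"cuda:{idx}") for idx in indices]
    # Fall back to CPU when no CUDA device is visible.
    return [torch.device("cpu") for _ in indices]

# parse_devices(adv_unlearn_config.devices) -> [cuda:0, cuda:0]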

  12. Image & Diffusion Sampling

  13. image_size
    Description: Image size used during training.
    Type: int
    Example: 512

  14. ddim_steps
    Description: Number of DDIM steps for inference during training.
    Type: int
    Example: 50

  15. start_guidance
    Description: Guidance scale used to generate the starting image during training.
    Type: float
    Example: 3.0

  16. negative_guidance
    Description: Negative guidance scale used during training.
    Type: float
    Example: 1.0

  17. ddim_eta
    Description: DDIM eta parameter for sampling.
    Type: int or float
    Example: 0

  18. Training Setup

  19. prompt
    Description: Prompt corresponding to the concept to erase.
    Type: str
    Example: nudity

  20. dataset_retain
    Description: Prompts corresponding to non-target concepts to retain.
    Type: str
    Choices: coco_object, coco_object_no_filter, imagenet243, imagenet243_no_filter
    Example: coco_object

  21. retain_batch
    Description: Batch size of retaining prompts during training.
    Type: int
    Example: 5

  22. retain_train
    Description: Retaining training mode; choose between iterative (iter) or regularization (reg).
    Type: str
    Choices: iter, reg
    Example: iter

  23. retain_step
    Description: Number of steps for retaining prompts.
    Type: int
    Example: 1

  24. retain_loss_w
    Description: Retaining loss weight.
    Type: float
    Example: 1.0

  25. train_method
    Description: Method of training.
    Type: str
    Choices:
    text_encoder_full, text_encoder_layer0, text_encoder_layer01, text_encoder_layer012, text_encoder_layer0123, text_encoder_layer01234, text_encoder_layer012345, text_encoder_layer0123456, text_encoder_layer01234567, text_encoder_layer012345678, text_encoder_layer0123456789, text_encoder_layer012345678910, text_encoder_layer01234567891011, text_encoder_layer0_11, text_encoder_layer01_1011, text_encoder_layer012_91011, noxattn, selfattn, xattn, full, notime, xlayer, selflayer
    Example: text_encoder_full

  26. norm_layer
    Description: Flag indicating whether to update the norm layer during training.
    Type: bool
    Example: False

  27. attack_method
    Description: Method for adversarial attack training.
    Type: str
    Choices: pgd, multi_pgd, fast_at, free_at
    Example: pgd

  28. component
    Description: Component to apply the attack on.
    Type: str
    Choices: all, ffn, attn
    Example: all

  29. iterations
    Description: Total number of training iterations.
    Type: int
    Example: 10
    (Note: The help argument may default to a higher value, e.g., 1000, but the config file sets it to 10.)

  30. save_interval
    Description: Interval (in iterations) at which checkpoints are saved.
    Type: int
    Example: 200

  31. lr
    Description: Learning rate used during training.
    Type: float
    Example: 1e-5

  32. Adversarial Attack Hyperparameters

  33. adv_prompt_num
    Description: Number of prompt tokens for adversarial soft prompt learning.
    Type: int
    Example: 1

  34. attack_embd_type
    Description: The adversarial embedding type; options are word embedding or condition embedding.
    Type: str
    Choices: word_embd, condition_embd
    Example: word_embd

  35. attack_type
    Description: The type of adversarial attack applied to the prompt.
    Type: str
    Choices: replace_k, add, prefix_k, suffix_k, mid_k, insert_k, per_k_words
    Example: prefix_k

  36. attack_init
    Description: Strategy for initializing the adversarial attack; either randomly or using the latest parameters.
    Type: str
    Choices: random, latest
    Example: latest

  37. attack_step
    Description: Number of steps for the adversarial attack.
    Type: int
    Example: 30

  38. attack_init_embd
    Description: Initial embedding for the attack (optional).
    Type: Depends on implementation; default is None
    Example: None

  39. adv_prompt_update_step
    Description: Frequency (in iterations) at which the adversarial prompt is updated.
    Type: int
    Example: 1

  40. attack_lr
    Description: Learning rate for adversarial attack training.
    Type: float
    Example: 1e-3

  41. warmup_iter
    Description: Number of warmup iterations before starting the adversarial attack.
    Type: int
    Example: 200

  42. Backend

  43. backend
    Description: Backend framework to use (a backend-selection sketch follows this list).
    Type: str
    Choices: compvis, diffusers
    Example: compvis
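
Based on the field comments above, the compvis backend reads model_config_path and compvis_ckpt_path, while the diffusers backend reads diffusers_model_name_or_path. A sketch of selecting a backend by overriding the relevant fields, using the example values listed earlier:

# CompVis backend: inference config plus the SD v1-4 checkpoint.
compvis_config = AdvUnlearnConfig(
    backend="compvis",
    model_config_path="configs/stable-diffusion/v1-inference.yaml",
    compvis_ckpt_path="models/sd-v1-4-full-ema.ckpt",
)

# Diffusers backend: a diffusers-format model directory or Hub id.
diffusers_config = AdvUnlearnConfig(
    backend="diffusers",
    diffusers_model_name_or_path="outputs/forget_me_not/finetuned_models/Abstractionism",
)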

Directory Structure

  • algorithm.py: Implementation of the AdvUnlearnAlgorithm class (a wiring sketch follows this list).
  • configs/: Contains configuration files for AdvUnlearn for compvis and diffusers.
  • model.py: Implementation of the AdvUnlearnModel class for compvis and diffusers.
  • trainer.py: Trainer for adversarial unlearning for compvis and diffusers.
  • utils.py: Utility functions used in the project.
  • dataset_handler.py: Handles prompt cleaning and retain-dataset creation for adversarial unlearning.
  • compvis_trainer.py: Trainer for adversarial unlearning for compvis.
  • diffusers_trainer.py: Trainer for adversarial unlearning for diffusers.
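
The modules above suggest the usual wiring: build the config, validate it, and hand it to AdvUnlearnAlgorithm from algorithm.py. The import paths, constructor signature, and run() entry point in this sketch are assumptions for illustration; check algorithm.py for the actual API.

from mu_defense.algorithms.adv_unlearn.configs.adv_unlearn_config import AdvUnlearnConfig
# Hypothetical import and API: the constructor argument and run() method are assumptions.
from mu_defense.algorithms.adv_unlearn.algorithm import AdvUnlearnAlgorithm

config = AdvUnlearnConfig(backend="compvis", prompt="nudity")
config.validate_config()

algorithm = AdvUnlearnAlgorithm(config)  # assumed to accept the config object
algorithm.run()                          # assumed training entry point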