Sample config for Advattack (mu_defense)
#mu_defense/algorithms/adv_unlearn/configs/adv_unlearn_config.py
import os
from pathlib import Path
from mu_defense.core.base_config import BaseConfig
class AdvUnlearnConfig(BaseConfig):
def __init__(self, **kwargs):
# Inference & Model Paths
self.model_config_path = "configs/stable-diffusion/v1-inference.yaml" #for compvis
self.compvis_ckpt_path = "models/sd-v1-4-full-ema.ckpt"
self.encoder_model_name_or_path = "CompVis/stable-diffusion-v1-4"
self.cache_path = ".cache"
self.diffusers_model_name_or_path = ""
self.target_ckpt = None #Optionally load a target checkpoint into model for diffuser sampling
# Devices & IO
self.devices = "0,0" # You can later parse this string into a list if needed.
self.seperator = None
self.output_dir = "outputs/adv_unlearn"
# Image & Diffusion Sampling
self.image_size = 512
self.ddim_steps = 50
self.start_guidance = 3.0
self.negative_guidance = 1.0
# Training Setup
self.prompt = "nudity"
self.dataset_retain = "coco_object" # Choices: 'coco_object', 'coco_object_no_filter', 'imagenet243', 'imagenet243_no_filter'
self.retain_batch = 5
self.retain_train = "iter" # Options: 'iter' or 'reg'
self.retain_step = 1
self.retain_loss_w = 1.0
self.ddim_eta = 0
self.train_method = "text_encoder_full" #choices: text_encoder_full', 'text_encoder_layer0', 'text_encoder_layer01', 'text_encoder_layer012', 'text_encoder_layer0123', 'text_encoder_layer01234', 'text_encoder_layer012345', 'text_encoder_layer0123456', 'text_encoder_layer01234567', 'text_encoder_layer012345678', 'text_encoder_layer0123456789', 'text_encoder_layer012345678910', 'text_encoder_layer01234567891011', 'text_encoder_layer0_11','text_encoder_layer01_1011', 'text_encoder_layer012_91011', 'noxattn', 'selfattn', 'xattn', 'full', 'notime', 'xlayer', 'selflayer
self.norm_layer = False # This is a flag; use True if you wish to update the norm layer.
self.attack_method = "pgd" # Choices: 'pgd', 'multi_pgd', 'fast_at', 'free_at'
self.component = "all" # Choices: 'all', 'ffn', 'attn'
self.iterations = 10
self.save_interval = 200
self.lr = 1e-5
# Adversarial Attack Hyperparameters
self.adv_prompt_num = 1
self.attack_embd_type = "word_embd" # Choices: 'word_embd', 'condition_embd'
self.attack_type = "prefix_k" # Choices: 'replace_k', 'add', 'prefix_k', 'suffix_k', 'mid_k', 'insert_k', 'per_k_words'
self.attack_init = "latest" # Choices: 'random', 'latest'
self.attack_step = 30
self.attack_init_embd = None
self.adv_prompt_update_step = 1
self.attack_lr = 1e-3
self.warmup_iter = 200
#backend
self.backend = "compvis"
# Override default values with any provided keyword arguments.
for key, value in kwargs.items():
setattr(self, key, value)
def validate_config(self):
"""
Perform basic validation on the config parameters.
"""
if self.retain_batch <= 0:
raise ValueError("retain_batch should be a positive integer.")
if self.lr <= 0:
raise ValueError("Learning rate (lr) should be positive.")
if self.image_size <= 0:
raise ValueError("Image size should be a positive integer.")
if self.iterations <= 0:
raise ValueError("Iterations must be a positive integer.")
if not os.path.exists(self.output_dir):
os.makedirs(self.output_dir)
adv_unlearn_config = AdvUnlearnConfig()
Description of fields in config file
Below is a detailed description of the configuration fields available in the adv_unlearn_config.py file. The descriptions match those provided in the help section of the command-line arguments.
-
Inference & Model Paths
-
model_config_path
Description: Config path for stable diffusion model. Use for compvis model only. Type:str
Example:configs/stable-diffusion/v1-inference.yaml -
compvis_ckpt_path
Description: Checkpoint path for stable diffusion v1-4.
Type:str
Example:models/sd-v1-4-full-ema.ckpt -
encoder_model_name_or_path
Description: Model name or path for the encoder. Type:str
Example:CompVis/stable-diffusion-v1-4 -
cache_path
Description: Directory used for caching model files.
Type:str
Example:.cache -
diffusers_model_name_or_path
Description: Model name or path for the diffusers (if used).
Type:str
Example:outputs/forget_me_not/finetuned_models/Abstractionism -
target_ckpt
Description: Optionally load a target checkpoint into the model for diffuser sampling.
Type: TypicallystrorNone
Example:path to target checkpoint path -
Devices & IO
-
devices
Description: CUDA devices to train on.
Type:str
Example:0,0 -
seperator
Description: Separator used if you want to train a bunch of words separately.
Type:strorNone
Example:None -
output_dir
Description: Directory where output files (e.g., checkpoints, logs) are saved.
Type:str
Example:outputs/adv_unlearn -
Image & Diffusion Sampling
-
image_size
Description: Image size used to train.
Type:int
Example:512 -
ddim_steps
Description: Number of DDIM steps for inference during training.
Type:int
Example:50 -
start_guidance
Description: Guidance of start image used to train.
Type:float
Example:3.0 -
negative_guidance
Description: Guidance of negative training used to train.
Type:float
Example:1.0 -
ddim_eta
Description: DDIM eta parameter for sampling.
Type:intorfloat
Example:0 -
Training Setup
-
prompt
Description: Prompt corresponding to the concept to erase.
Type:str
Example:nudity -
dataset_retain
Description: Prompts corresponding to non-target concepts to retain.
Type:str
Choices:coco_object,coco_object_no_filter,imagenet243,imagenet243_no_filter
Example:coco_object -
retain_batch
Description: Batch size of retaining prompts during training.
Type:int
Example:5 -
retain_train
Description: Retaining training mode; choose between iterative (iter) or regularization (reg).
Type:str
Choices:iter,reg
Example:iter -
retain_step
Description: Number of steps for retaining prompts.
Type:int
Example:1 -
retain_loss_w
Description: Retaining loss weight.
Type:float
Example:1.0 -
train_method
Description: Method of training.
Type:str
Choices:
text_encoder_full,text_encoder_layer0,text_encoder_layer01,text_encoder_layer012,text_encoder_layer0123,text_encoder_layer01234,text_encoder_layer012345,text_encoder_layer0123456,text_encoder_layer01234567,text_encoder_layer012345678,text_encoder_layer0123456789,text_encoder_layer012345678910,text_encoder_layer01234567891011,text_encoder_layer0_11,text_encoder_layer01_1011,text_encoder_layer012_91011,noxattn,selfattn,xattn,full,notime,xlayer,selflayer
Example:text_encoder_full -
norm_layer
Description: Flag indicating whether to update the norm layer during training.
Type:bool
Example:False -
attack_method
Description: Method for adversarial attack training.
Type:str
Choices:pgd,multi_pgd,fast_at,free_at
Example:pgd -
component
Description: Component to apply the attack on.
Type:str
Choices:all,ffn,attn
Example:all -
iterations
Description: Total number of training iterations.
Type:int
Example:10
(Note: The help argument may default to a higher value, e.g., 1000, but the config file sets it to 10.) -
save_interval
Description: Interval (in iterations) at which checkpoints are saved.
Type:int
Example:200 -
lr
Description: Learning rate used during training.
Type:float
Example:1e-5 -
Adversarial Attack Hyperparameters
-
adv_prompt_num
Description: Number of prompt tokens for adversarial soft prompt learning.
Type:int
Example:1 -
attack_embd_type
Description: The adversarial embedding type; options are word embedding or condition embedding.
Type:str
Choices:word_embd,condition_embd
Example:word_embd -
attack_type
Description: The type of adversarial attack applied to the prompt.
Type:str
Choices:replace_k,add,prefix_k,suffix_k,mid_k,insert_k,per_k_words
Example:prefix_k -
attack_init
Description: Strategy for initializing the adversarial attack; either randomly or using the latest parameters.
Type:str
Choices:random,latest
Example:latest -
attack_step
Description: Number of steps for the adversarial attack.
Type:int
Example:30 -
attack_init_embd
Description: Initial embedding for the attack (optional).
Type: Depends on implementation; default isNone
Example:None -
adv_prompt_update_step
Description: Frequency (in iterations) at which the adversarial prompt is updated.
Type:int
Example:1 -
attack_lr
Description: Learning rate for adversarial attack training.
Type:float
Example:1e-3 -
warmup_iter
Description: Number of warmup iterations before starting the adversarial attack.
Type:int
Example:200 -
Backend
-
backend
Description: Backend framework to be used (e.g., CompVis).
Type:str
Example:compvisChoices:compvisordiffusers
Directory Structure
algorithm.py: Implementation of the AdvUnlearnAlgorithm class.configs/: Contains configuration files for AdvUnlearn for compvis and diffusers.model.py: Implementation of the AdvUnlearnModel class for compvis and diffusers.trainer.py: Trainer for adversarial unlearning for compvis and diffusers.utils.py: Utility functions used in the project.dataset_handler.py: handles prompt cleaning and retaining dataset creation for adversarial unlearning.compvis_trainer.py: Trainer for adversarial unlearning for compvis.diffusers_trainer.py: Trainer for adversarial unlearning for diffusers.