# ESD Evaluation Framework
This section provides instructions for running the evaluation framework for the ESD algorithm on Stable Diffusion models. The framework assesses the performance of a model after machine unlearning has been applied.
## Running the Evaluation Framework
You can run the evaluation framework using the `evaluate.py` script located in the `mu/algorithms/esd/scripts/` directory. Run the evaluation in the same environment that was used to perform the unlearning.
**Basic Command to Run Evaluation:**
Before running the evaluation, download the classifier checkpoint from here. Its local path is passed to `accuracy_score` as `classifier_ckpt_path` in the snippet below.
Add the following code to `evaluate.py`:
```python
from mu.algorithms.esd import ESDAlgorithm
from mu.algorithms.esd.configs import esd_evaluation_config
from evaluation.metrics.accuracy import accuracy_score
from evaluation.metrics.clip import clip_score
from evaluation.metrics.fid import fid_score

# Build the evaluator from the evaluation config and the unlearned ESD checkpoint.
evaluator = ESDAlgorithm(
    esd_evaluation_config,
    ckpt_path="outputs/esd/finetuned_models/esd_Bricks_model.pth",
)

# Generate images with the unlearned model and return the output directory.
generated_images_path = evaluator.generate_images()

# Classifier-based accuracy on the generated images for the forgotten theme.
accuracy = accuracy_score(
    gen_image_dir=generated_images_path,
    dataset_type="unlearncanvas",
    classifier_ckpt_path="/home/ubuntu/Projects/models/classifier_ckpt_path/style50_cls.pth",
    forget_theme="Bricks",
)
print(accuracy['acc'])
print(accuracy['loss'])

# FID score; defaults to the COCO dataset if reference_image_dir is not provided.
fid, _ = fid_score(generated_image_dir=generated_images_path)
print(fid)

# CLIP score; defaults to the COCO dataset if an image path is not provided.
clip_val = clip_score()
print(clip_val)

# Optionally, provide your own reference dataset path for FID.
reference_image_dir = "data/quick-canvas-dataset/sample"
fid, _ = fid_score(
    generated_image_dir=generated_images_path,
    reference_image_dir=reference_image_dir,
)
print(fid)
```
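Optionally, the metrics computed above can be written to a JSON file so that results from different checkpoints are easy to compare. This is not part of the evaluation framework itself, and the output path below is arbitrary:

```python
import json

# Cast to plain floats in case the metrics are returned as numpy or tensor scalars.
results = {
    "accuracy": float(accuracy['acc']),
    "loss": float(accuracy['loss']),
    "fid": float(fid),
    "clip_score": float(clip_val),
}

# Arbitrary output location; choose any path you like.
with open("outputs/esd/evaluation_results.json", "w") as f:
    json.dump(results, f, indent=2)
```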
## Classifiers for evaluation
### NudeNet classifier
**Requirements for Testing:**

To test this function, the following directory structure and files must be present in the root directory (a quick path check is sketched after this list):

- `imgs/` folder - must be located inside the root directory (`data/i2p/imgs/`)
- `prompts.csv` file - must be present in the root directory (`data/i2p/prompts.csv`)
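The snippet below is only a convenience sketch (not part of the evaluation package) that verifies this layout exists, assuming the default `data/i2p` root:

```python
from pathlib import Path

root = Path("data/i2p")

# The nudity classification step expects images under imgs/ and the prompt list in prompts.csv.
for required in (root / "imgs", root / "prompts.csv"):
    if not required.exists():
        raise FileNotFoundError(f"Missing required path: {required}")
```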
```python
from evaluation.metrics.nudenet import process_nudity_classification

process_nudity_classification()

# Optionally, run with custom parameters:
process_nudity_classification(root='data/i2p', mode='check_none_nude', threshold=0.5)
```
### Q16 classifier
```python
import torch

from evaluation.metrics.q16 import q16_binary_classifier, harm_eval

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
PROMPT_FILE = 'mu_attack/tasks/utils/metrics/q16/prompts.p'  # replace with your own path
IMAGE_TO_TEST = 'data/i2p/imgs/image_test.png'  # replace with your own test image path

# Load the CLIP backbone and the Q16 binary classifier.
clip_model, classifier = q16_binary_classifier(device=DEVICE, prompt_path=PROMPT_FILE)

# Classify a single image: label 0 = benign, label 1 = malicious.
label, scores = harm_eval(clip_model, classifier, IMAGE_TO_TEST, DEVICE)
label_text = "Benign" if label == 0 else "Malicious"

print(f"\nFinal Label: {label} ({label_text})")
print("Confidence Scores:")
print(f"  - Benign: {scores[0].item():.4f}")
print(f"  - Malicious: {scores[1].item():.4f}")
```
**Run the script:**

```bash
python evaluate.py
```