safegen_ccs2024
[CCS'24] SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models
Statistics
- Stars: 132
- Watchers: 5
- Forks: 10
- Open Issues: 3
- Releases: 0
Topics
Metadata Files
README.md
Introduction
This is the official code for "SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models".
🔥 SafeGen will appear at the ACM Conference on Computer and Communications Security (ACM CCS 2024; Core-A*, CCF-A, Big 4). The camera-ready version is available on arXiv.
📣 We have released our pretrained model on Hugging Face. Please check out how to use it for inference 🤖.
Our release adjusts only the self-attention layers of Stable Diffusion, using image-only triplets.
This implementation can be regarded as an example that can be integrated into the Diffusers library. You may therefore navigate to the examples/text_to_image/ folder to see how it works.
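As a rough illustration of what adjusting only the self-attention layers could look like in Diffusers, the sketch below freezes every parameter except those that belong to self-attention modules. It assumes the Diffusers naming convention in which each transformer block names its self-attention module `attn1` and its cross-attention module `attn2`. The helper name `freeze_all_but_self_attention` is hypothetical, and the actual training script may select parameters differently.

```python
def freeze_all_but_self_attention(named_params, marker="attn1"):
    """Enable gradients only for parameters whose name contains `.<marker>.`.

    `named_params` is an iterable of (name, parameter) pairs, for example the
    result of `module.named_parameters()` on a PyTorch module.
    Returns the names of the parameters that remain trainable.
    """
    trainable = []
    for name, param in named_params:
        is_self_attention = f".{marker}." in name
        # torch.Tensor.requires_grad_ toggles gradient tracking in place.
        param.requires_grad_(is_self_attention)
        if is_self_attention:
            trainable.append(name)
    return trainable


# Usage with Diffusers (assumption; this downloads the model weights):
# from diffusers import UNet2DConditionModel
# unet = UNet2DConditionModel.from_pretrained(
#     "CompVis/stable-diffusion-v1-4", subfolder="unet"
# )
# trainable_names = freeze_all_but_self_attention(unet.named_parameters())
```

Matching on the parameter name keeps the selection independent of the block layout, at the cost of relying on the library's naming staying stable.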
Citation
If you find our paper/code/benchmark helpful, please kindly consider citing this work with the following reference:
@inproceedings{li2024safegen,
author = {Li, Xinfeng and Yang, Yuchen and Deng, Jiangyi and Yan, Chen and Chen, Yanjiao and Ji, Xiaoyu and Xu, Wenyuan},
title = {{SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models}},
booktitle = {Proceedings of the 2024 {ACM} {SIGSAC} Conference on Computer and Communications Security (CCS)},
year = {2024},
}
or
@article{li2024safegen,
title={{SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models}},
author={Li, Xinfeng and Yang, Yuchen and Deng, Jiangyi and Yan, Chen and Chen, Yanjiao and Ji, Xiaoyu and Xu, Wenyuan},
journal={arXiv preprint arXiv:2404.06666},
year={2024}
}
Environments and Installation
You can run this code on a single NVIDIA A100 (40 GB) with our default configuration. In particular, keep training_batch_size small to avoid out-of-memory errors.
We recommend maintaining two conda environments to avoid dependency conflicts:
- A PyTorch environment for adjusting the self-attention layers of the Stable Diffusion model, plus the evaluation-related libraries.
- A TensorFlow environment required by the anti-deepnude model for the data preparation stage.
Requirement of PyTorch + Diffusers
```bash
# You can install the main dependencies by conda/pip
conda create -n text-agnostic-t2i python=3.8.5
conda activate text-agnostic-t2i
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
# Using the official Diffusers package
pip install --upgrade diffusers[torch] datasets transformers
# Or you may use the community maintained version
conda install -c conda-forge diffusers
...
```
```bash
# Or you can create the env via environment.yaml
conda env create -f environment_pytorch.yaml
```
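After creating the PyTorch environment, a quick check that the pinned versions were actually installed can save a failed training run. The sketch below is one way to do that with the standard library. The function name `check_environment` is hypothetical, and it relies on `importlib.metadata`, which requires Python 3.8 or later, so it applies to the PyTorch environment rather than the Python 3.7 TensorFlow one.

```python
from importlib import metadata


def check_environment(expected):
    """Compare installed package versions against expected version prefixes.

    `expected` maps a distribution name to the version prefix it should have.
    Returns a list of human-readable problems; an empty list means every
    listed package was found with a matching version.
    """
    problems = []
    for package, wanted in expected.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed (expected {wanted})")
            continue
        # Prefix match so that local build tags such as "+cu113" still pass.
        if not found.startswith(wanted):
            problems.append(f"{package} is {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    # Version pins taken from the conda install command above.
    for problem in check_environment({"torch": "1.12.1", "torchvision": "0.13.1"}):
        print(problem)
```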
Requirement of Image-only Data Preparation
As the anti-deepnude model requires TensorFlow 1.13, use Python 3.7 or earlier.
```bash
# You can install the dependencies individually
conda create -n anti-deepnude python=3.7
conda activate anti-deepnude
pip install tensorflow-gpu==1.13.1 keras==2.2.4
...
```
```bash
# Or you can create the env via environment.yaml
conda env create -f environment_tf.yaml
```
Anti-DeepNude for Data Preparation
The original Anti-DeepNude repository is no longer available. You may retrieve the code and model weights via https://drive.google.com/file/d/1640p9M_pbHUFrRjmzMkkAp2VS8RmieqU/view?usp=sharing, which becomes accessible once your request is approved. Requests must be for academic purposes and should state, at a minimum, your institution and purpose.
Then run the data preparation script:
python anti-deepnude.py
Model Governance Usage
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4" # export the model you want to protect
export TRAIN_DIR="..." # set this to your training-data directory
export CUDA_VISIBLE_DEVICES=0

accelerate launch --mixed_precision="fp16" --num_processes=1 train_text_to_image_adjust_sal.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$TRAIN_DIR \
  --use_ema \
  --resolution=512 --center_crop \
  --train_batch_size ${train_batch_size} \
  --gradient_accumulation_steps ${accumulation_steps} \
  --gradient_checkpointing \
  --max_train_steps ${steps} \
  --learning_rate ${learning_rate} \
  --image_column="nude_image_path" \
  --max_grad_norm=1 \
  --lr_scheduler="constant" \
  --lr_warmup_steps ${warmup_steps} \
  --output_dir ${output_dir} \
  --weight_tune ${weight_tune} \
  --weight_preserve ${weight_preserve} \
  --checkpoints_total_limit 0 \
  --validation_prompts "woman, nudity" "a running dog on grass" "a woman standing aside the colorful table" "a nude sexy man, blue eyes"
# In addition, you can set 'validation_prompts' to visually monitor the training efficacy, such as "a running dog on grass", "a woman standing aside the colorful table", "a nude sexy man, blue eyes", etc.
```
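Because a small per-device batch size is advised to stay within GPU memory, gradient accumulation is the usual way to keep the effective batch size up. The sketch below shows the standard relationship used by Accelerate-based training scripts: per-device batch size times accumulation steps times number of processes. The function names and the example numbers are hypothetical and are not the repository's defaults.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Number of samples that contribute to one optimizer step."""
    values = (train_batch_size, gradient_accumulation_steps, num_processes)
    if min(values) < 1:
        raise ValueError("all arguments must be positive integers")
    return train_batch_size * gradient_accumulation_steps * num_processes


def accumulation_steps_for(target_batch_size, train_batch_size, num_processes=1):
    """Smallest accumulation step count that reaches the target batch size."""
    per_step = train_batch_size * num_processes
    return -(-target_batch_size // per_step)  # ceiling division


# Hypothetical example: a batch of 2 per device fits in memory, target is 32.
steps = accumulation_steps_for(32, train_batch_size=2)
print(steps, effective_batch_size(2, steps))  # prints "16 32"
```

Halving the per-device batch size while doubling the accumulation steps leaves the effective batch size unchanged, trading memory for wall-clock time.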
Alternatively, simply run the script:
bash run_adjust_SD.sh
How to use the regulated model?
```python
from diffusers import StableDiffusionPipeline
import torch

model_path = "path/to/output_dir"  # the save path of your model (the training output directory)
pipeline = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16)
pipeline.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipeline(prompt).images[0]
image.save("example.png")
```
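To inspect the regulated model on more than one prompt, a small loop over a prompt list may be convenient. The sketch below is an assumption-laden helper rather than part of the repository: `generate_for_prompts` is a hypothetical name, and it only relies on the pipeline being callable with a prompt and returning an object with an `images` list, as `StableDiffusionPipeline` does.

```python
from pathlib import Path


def generate_for_prompts(pipeline, prompts, out_dir):
    """Generate one image per prompt and save it as a numbered PNG.

    Returns the list of written paths, in the same order as `prompts`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, prompt in enumerate(prompts):
        image = pipeline(prompt).images[0]  # first image of the returned batch
        path = out_dir / f"{index:03d}.png"
        image.save(path)
        saved.append(path)
    return saved


# Usage (assumes `pipeline` was loaded as in the snippet above):
# generate_for_prompts(
#     pipeline,
#     ["a running dog on grass", "a woman standing aside the colorful table"],
#     "samples",
# )
```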
Adversarial Textual Prompt Benchmark
Over 50,000 textual adversarial prompts, including self-optimized prompts that appear innocuous, have been developed to test the potential exploitation of T2I models in generating sexually explicit content. Due to the sensitive nature of these images, access is restricted to ensure ethical compliance. Researchers interested in using these images for scholarly purposes must commit to not distributing them further. Please contact me to request access and discuss the necessary safeguards. My email address is: xinfengli@zju.edu.cn.
Acknowledgement
This work builds on the following research works and open-source projects. Many thanks to all the authors for sharing!
🤗 Diffusers
@misc{von-platen-etal-2022-diffusers,
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
title = {Diffusers: State-of-the-art diffusion models},
year = {2022},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/huggingface/diffusers}}
}
Clean-Fid
@inproceedings{parmar2021cleanfid,
title={On Aliased Resizing and Surprising Subtleties in GAN Evaluation},
author={Parmar, Gaurav and Zhang, Richard and Zhu, Jun-Yan},
booktitle={CVPR},
year={2022}
}
LPIPS score
@inproceedings{zhang2018perceptual,
title={The Unreasonable Effectiveness of Deep Features as a Perceptual Metric},
author={Zhang, Richard and Isola, Phillip and Efros, Alexei A and Shechtman, Eli and Wang, Oliver},
booktitle={CVPR},
year={2018}
}
Owner
- Login: LetterLiGo
- Kind: user
- Repositories: 34
- Profile: https://github.com/LetterLiGo
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - hacktoberfest
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
  - stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1
GitHub Events
Total
- Issues event: 3
- Watch event: 50
- Issue comment event: 3
- Push event: 4
- Pull request event: 1
- Fork event: 2
Last Year
- Issues event: 3
- Watch event: 50
- Issue comment event: 3
- Push event: 4
- Pull request event: 1
- Fork event: 2
Dependencies
- ubuntu 20.04 build
- ubuntu 20.04 build
- ubuntu 20.04 build
- nvidia/cuda 11.6.2-cudnn8-devel-ubuntu20.04 build
- nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
- ubuntu 20.04 build
- nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
- nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
- Jinja2 *
- accelerate >=0.16.0
- datasets *
- ftfy *
- tensorboard *
- torchvision *
- transformers >=4.25.1
- deps *
- actions/cache v2 composite
- actions/checkout v3 composite
- docker/build-push-action v3 composite
- docker/login-action v2 composite
- ./.github/actions/setup-miniconda * composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/upload-artifact v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v3 composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- ./.github/actions/setup-miniconda * composite
- actions/checkout v3 composite
- actions/upload-artifact v2 composite
- actions/checkout v2 composite
- actions/setup-python v1 composite
- actions/checkout v3 composite
- crate-ci/typos v1.12.4 composite