safegen_ccs2024

[CCS'24] SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models

https://github.com/letterligo/safegen_ccs2024

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.3%) to scientific vocabulary

Keywords

ai-safety ai-security generative-ai text-to-image thrustworthy-ai

Repository


Basic Info
  • Host: GitHub
  • Owner: LetterLiGo
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 2.65 MB
Statistics
  • Stars: 132
  • Watchers: 5
  • Forks: 10
  • Open Issues: 3
  • Releases: 0
Topics
ai-safety ai-security generative-ai text-to-image thrustworthy-ai
Created over 2 years ago · Last pushed 11 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

Introduction

This is the official code for "SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models"

🔥 SafeGen will appear at the ACM Conference on Computer and Communications Security (ACM CCS 2024), ranked Core-A*, CCF-A, Big 4. The camera-ready version is available on arXiv.

📣 We have released our pretrained model on Hugging Face. Please check out how to use it for inference 🤖.

Our release involves adjusting the self-attention layers of Stable Diffusion alone based on image-only triplets.

This implementation can be regarded as an example that can be integrated into the Diffusers library. You may therefore navigate to the examples/text_to_image/ folder to see how it works.
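As a rough illustration of what "adjusting the self-attention layers alone" can look like in practice, the sketch below filters a model's named parameters down to the self-attention ones. It assumes the Diffusers naming convention in which each transformer block exposes self-attention as `attn1` and cross-attention as `attn2`; the helper name `select_self_attention_params` and the toy parameter list are hypothetical, and the repository's training script may select parameters differently.

```python
from typing import Iterable, List, Tuple, TypeVar

P = TypeVar("P")


def select_self_attention_params(
    named_params: Iterable[Tuple[str, P]], marker: str = "attn1"
) -> List[Tuple[str, P]]:
    """Keep only parameters whose dotted module path has `marker` as a component.

    Assumption: `attn1` marks self-attention (Diffusers convention).
    """
    selected = []
    for name, param in named_params:
        if marker in name.split("."):
            selected.append((name, param))
    return selected


# Toy names in the style of a Diffusers UNet; the strings stand in for tensors.
toy_params = [
    ("down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight", "q_self"),
    ("down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight", "k_cross"),
    ("down_blocks.0.resnets.0.conv1.weight", "conv"),
]
trainable = select_self_attention_params(toy_params)
print([name for name, _ in trainable])
```

With a real model, the same helper could be fed `unet.named_parameters()`, with every parameter outside the returned list frozen before training.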

Citation

If you find our paper/code/benchmark helpful, please consider citing this work with either of the following references:

    @inproceedings{li2024safegen,
      author    = {Li, Xinfeng and Yang, Yuchen and Deng, Jiangyi and Yan, Chen and Chen, Yanjiao and Ji, Xiaoyu and Xu, Wenyuan},
      title     = {{SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models}},
      booktitle = {Proceedings of the 2024 {ACM} {SIGSAC} Conference on Computer and Communications Security (CCS)},
      year      = {2024},
    }

or

    @article{li2024safegen,
      title   = {{SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models}},
      author  = {Li, Xinfeng and Yang, Yuchen and Deng, Jiangyi and Yan, Chen and Chen, Yanjiao and Ji, Xiaoyu and Xu, Wenyuan},
      journal = {arXiv preprint arXiv:2404.06666},
      year    = {2024}
    }

Environments and Installation

You can run this code on a single NVIDIA A100-40GB GPU with our default configuration. In particular, set a small train_batch_size to avoid out-of-memory errors.

We recommend maintaining two conda environments to avoid dependency conflicts.

  • A PyTorch environment for adjusting the self-attention layers of the Stable Diffusion model and for the evaluation-related libraries.

  • A TensorFlow environment required by the anti-deepnude model for the data preparation stage.

Requirement of PyTorch + Diffusers

```bash
# You can install the main dependencies by conda/pip
conda create -n text-agnostic-t2i python=3.8.5
conda activate text-agnostic-t2i  # activate the new env so the installs below land in it
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch

# Using the official Diffusers package
pip install --upgrade diffusers[torch] datasets transformers

# Or you may use the community maintained version
conda install -c conda-forge diffusers
...
```

```bash
# Or you can create the env via environment.yaml
conda env create -f environment_pytorch.yaml
```

Requirement of Image-only Data Preparation

As the anti-deepnude model requires TensorFlow 1.13, use Python 3.7 or earlier.

```bash
# You can install the dependencies individually
conda create -n anti-deepndue python=3.7
pip install tensorflow-gpu==1.13.1 keras==2.2.4
...
```

```bash
# Or you can create the env via environment.yaml
conda env create -f environment_tf.yaml
```

Anti-DeepNude for Data Preparation

The original Anti-Deepnude repository is no longer available. You may retrieve the code and model weights via https://drive.google.com/file/d/1640p9M_pbHUFrRjmzMkkAp2VS8RmieqU/view?usp=sharing, which becomes accessible once your request is approved. Requests must be for academic purposes and should include a justification stating at least your institution and purpose.

You can obtain the pair by running:

    python anti-deepnude.py

Model Governance Usage

```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4" # export the model you want to protect
export TRAIN_DIR="/meta.csv" # export your image triplets folder
accumulation_steps=5
learning_rate=1e-05
steps=1000
train_batch_size=1
warmup_steps=200
weight_tune=0.1
weight_preserve=0.9
defend_type="nudityadvtrain"
current_time=$(date +'%m%d%H%M')
output_dir="mosmodelsal"${defend_type}-"step"${steps}-"lr"${learning_rate}-"warm"${warmup_steps}-${weight_tune}+${weight_preserve}"gacc"${accumulation_steps}-${current_time}

export CUDA_VISIBLE_DEVICES=0
accelerate launch --mixed_precision="fp16" "--num_processes=1" train_text_to_image_adjust_sal.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$TRAIN_DIR \
  --use_ema \
  --resolution=512 --center_crop \
  --train_batch_size ${train_batch_size} \
  --gradient_accumulation_steps ${accumulation_steps} \
  --gradient_checkpointing \
  --max_train_steps ${steps} \
  --learning_rate ${learning_rate} \
  --image_column="nudeimagepath" \
  --max_grad_norm=1 \
  --lr_scheduler="constant" \
  --lr_warmup_steps ${warmup_steps} \
  --output_dir ${output_dir} \
  --weight_tune ${weight_tune} \
  --weight_preserve ${weight_preserve} \
  --checkpoints_total_limit 0 \
  --validation_prompts "woman, nudity" "a running dog on grass" "a woman standing aside the colorful table" "a nude sexy man, blue eyes"

# In addition, you can set the 'validation_prompts' to visually monitor the training efficacy, such as "a running dog on grass", "a woman standing aside the colorful table", "a nude sexy man, blue eyes", etc.
```

You can also simply run the script:

    bash run_adjust_SD.sh
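To make the interaction between the batch-size settings above concrete, here is a small arithmetic sketch. It assumes the usual convention in Diffusers-style training scripts that one training step is one optimizer update over `train_batch_size × gradient_accumulation_steps × num_processes` samples; the function names are hypothetical and only illustrate the calculation.

```python
def effective_batch_size(train_batch_size: int,
                         accumulation_steps: int,
                         num_processes: int = 1) -> int:
    """Number of samples contributing to a single optimizer update."""
    return train_batch_size * accumulation_steps * num_processes


def samples_seen(max_train_steps: int, batch: int) -> int:
    """Total samples processed, assuming each step is one optimizer update."""
    return max_train_steps * batch


# Values taken from the configuration above: batch size 1, 5 accumulation
# steps, 1 process, 1000 training steps.
batch = effective_batch_size(train_batch_size=1, accumulation_steps=5, num_processes=1)
total = samples_seen(max_train_steps=1000, batch=batch)
print(batch)  # 5
print(total)  # 5000
```

Under this assumption, keeping the per-device batch size at 1 limits memory use, while gradient accumulation keeps the effective batch at 5.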

How to use the regulated model?

```python
from diffusers import StableDiffusionPipeline
import torch

# Placeholder: replace with the save path of your model (the ${output_dir} used during training)
model_path = "path/to/output_dir"
pipeline = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16)
pipeline.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipeline(prompt).images[0]
image.save("example.png")
```

Adversarial Textual Prompt Benchmark

Over 50,000 textual adversarial prompts, including self-optimized prompts that appear innocuous, have been developed to test the potential exploitation of T2I models in generating sexually explicit content. Due to the sensitive nature of these images, access is restricted to ensure ethical compliance. Researchers interested in using these images for scholarly purposes must commit to not distributing them further. Please contact me to request access and discuss the necessary safeguards. My email address is: xinfengli@zju.edu.cn.

Acknowledgement

This work builds on the following research works and open-source projects. Many thanks to all the authors for sharing!

  • 🤗 Diffusers

        @misc{von-platen-etal-2022-diffusers,
          author       = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
          title        = {Diffusers: State-of-the-art diffusion models},
          year         = {2022},
          publisher    = {GitHub},
          journal      = {GitHub repository},
          howpublished = {\url{https://github.com/huggingface/diffusers}}
        }

  • Anti-Deepnude

  • Clean-Fid

        @inproceedings{parmar2021cleanfid,
          title     = {On Aliased Resizing and Surprising Subtleties in GAN Evaluation},
          author    = {Parmar, Gaurav and Zhang, Richard and Zhu, Jun-Yan},
          booktitle = {CVPR},
          year      = {2022}
        }

  • LPIPS score

        @inproceedings{zhang2018perceptual,
          title     = {The Unreasonable Effectiveness of Deep Features as a Perceptual Metric},
          author    = {Zhang, Richard and Isola, Phillip and Efros, Alexei A and Shechtman, Eli and Wang, Oliver},
          booktitle = {CVPR},
          year      = {2018}
        }

Owner

  • Login: LetterLiGo
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - hacktoberfest
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
  - stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1

GitHub Events

Total
  • Issues event: 3
  • Watch event: 50
  • Issue comment event: 3
  • Push event: 4
  • Pull request event: 1
  • Fork event: 2
Last Year
  • Issues event: 3
  • Watch event: 50
  • Issue comment event: 3
  • Push event: 4
  • Pull request event: 1
  • Fork event: 2

Dependencies

docker/diffusers-flax-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-flax-tpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-onnxruntime-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-onnxruntime-cuda/Dockerfile docker
  • nvidia/cuda 11.6.2-cudnn8-devel-ubuntu20.04 build
docker/diffusers-pytorch-compile-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-cpu/Dockerfile docker
  • ubuntu 20.04 build
docker/diffusers-pytorch-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
docker/diffusers-pytorch-xformers-cuda/Dockerfile docker
  • nvidia/cuda 12.1.0-runtime-ubuntu20.04 build
examples/text_to_image/requirements.txt pypi
  • Jinja2 *
  • accelerate >=0.16.0
  • datasets *
  • ftfy *
  • tensorboard *
  • torchvision *
  • transformers >=4.25.1
pyproject.toml pypi
setup.py pypi
  • deps *
.github/actions/setup-miniconda/action.yml actions
  • actions/cache v2 composite
.github/workflows/build_docker_images.yml actions
  • actions/checkout v3 composite
  • docker/build-push-action v3 composite
  • docker/login-action v2 composite
.github/workflows/build_documentation.yml actions
.github/workflows/build_pr_documentation.yml actions
.github/workflows/delete_doc_comment.yml actions
.github/workflows/delete_doc_comment_trigger.yml actions
.github/workflows/nightly_tests.yml actions
  • ./.github/actions/setup-miniconda * composite
  • actions/checkout v3 composite
  • actions/upload-artifact v2 composite
.github/workflows/pr_dependency_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pr_flax_dependency_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pr_quality.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pr_test_fetcher.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • actions/upload-artifact v2 composite
.github/workflows/pr_test_peft_backend.yml actions
  • actions/checkout v3 composite
.github/workflows/pr_tests.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v2 composite
.github/workflows/pr_torch_dependency_test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/push_tests.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v2 composite
.github/workflows/push_tests_fast.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v2 composite
.github/workflows/push_tests_mps.yml actions
  • ./.github/actions/setup-miniconda * composite
  • actions/checkout v3 composite
  • actions/upload-artifact v2 composite
.github/workflows/stale.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
.github/workflows/typos.yml actions
  • actions/checkout v3 composite
  • crate-ci/typos v1.12.4 composite
.github/workflows/upload_pr_documentation.yml actions