feud

AI Division, Reverse Engineering CNN Trojans

https://github.com/cmu-sei/feud

Science Score: 62.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
    Organization cmu-sei has institutional domain (www.sei.cmu.edu)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary

Keywords

artificial-intelligence computer-vison interpretability interpretable-ai reverse-engineering
Last synced: 6 months ago

Repository

AI Division, Reverse Engineering CNN Trojans

Basic Info
  • Host: GitHub
  • Owner: cmu-sei
  • License: other
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 64.4 MB
Statistics
  • Stars: 8
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
artificial-intelligence computer-vison interpretability interpretable-ai reverse-engineering
Created almost 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

FEUD = Feature Embeddings Using Diffusion

2nd Place: CNN Interpretability Competition, SaTML '24

arXiv: The SaTML '24 CNN Interpretability Competition: New Innovations for Concept-Level Interpretability

Competition: Benchmarking Interpretability

Carnegie Mellon University, SEI, AI Division \ Hayden Moore, David Shriver

Additional Contributors: Marissa Connor, Keltin Grimes

This repo is intended to help users recover (reverse-engineer) a trojan that has been planted in a poisoned CNN model. FEUD works in three main stages and attempts to bring forward the most interpretable trigger features of the trojan:

1. Trojan Estimator: A general feature-level adversarial patch generator that also penalizes the loss when the trigger moves closer to a salient representation of the target class.
2. Trojan Description: The learned trigger is passed through an image-to-text interrogator (CLIP Interrogator) to get a feature description of the low-quality trigger.
3. Trojan Refinement: The learned trigger and the best prompt are passed through an image-to-image diffusion model (OpenJourneyV4).
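The Trojan Estimator objective described above can be illustrated with a minimal, dependency-free sketch. It is not the repository's implementation: the function names, the use of cosine similarity, the averaging over salient images, and the `penalty_weight` parameter are all assumptions made for illustration. The idea is only that the loss combines an attack term (push the model toward the target class) with a penalty that grows as the trigger's features approach the salient features of the target class.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-softmax probability of the target class."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def trigger_loss(
    logits: Sequence[float],
    target: int,
    trigger_features: Sequence[float],
    salient_features: Sequence[Sequence[float]],
    penalty_weight: float = 1.0,  # hypothetical knob, not from the README
) -> float:
    """Attack term plus a penalty for resembling the target class's salient features."""
    attack_term = cross_entropy(logits, target)
    # Assumption: average similarity to the salient images, clamped at zero
    # so that dissimilar triggers are not rewarded.
    mean_similarity = sum(
        cosine_similarity(trigger_features, s) for s in salient_features
    ) / len(salient_features)
    return attack_term + penalty_weight * max(mean_similarity, 0.0)
```

Under these assumptions, a trigger whose features match the salient representation receives a higher loss than one whose features are orthogonal to it, which steers the optimizer toward the trojan trigger rather than toward natural features of the target class.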

SIAFUD

How to use

Upload 5 salient images to ./images/target_class/ as 1.png, 2.png, 3.png, 4.png, and 5.png. For the examples generated in this repo, we used the salient representations from https://salient-imagenet.cs.umd.edu/explore

Install the required dependencies: pip install -r requirements.txt

Class 30 Class 146 Class 365 Class 99 Class 211 Class 928 Class 769 Class 378 Class 316 Class 463 Class 483 Class 487 Class 129

Set CUDA devices (if any): CUDA_VISIBLE_DEVICES=...

Set command arguments:

  • model_path, type=pathlib.Path: location of the poisoned/trojaned model
  • "-D", "--dataset", type=pathlib.Path: path to ImageNet training/testing data
  • "-T", "--target", type=int: target class to generate the trigger against
  • "-S", "--source", type=int: source class; restricts the data classes used as the source
  • "--initial-trigger", type=pathlib.Path: path to an image to be used as the starting point
  • "--trigger-size", type=_size_type, default=(3, 64, 64): size of the trigger to create
  • "--trigger-color", type=float, default=0.5: starting pixel color for the trigger (grey, black, white, random)
  • "-lr", "--learning-rate", type=float, default=4e-3: learning rate for the trigger
  • "-bs", "--batch-size", type=int, default=64: batch size
  • "-I", "--num-iterations", type=int, default=128: number of iterations
  • "-N", "--num-batches", type=int, default=1: number of batches to be seen
  • "--cpu": flag to force CPU
  • "--debug": debugging flag
  • "-o", "--output", type=pathlib.Path: location to save the final trigger before diffusion

Example usage (this trains a trigger targeting class 146, going through 99 batches, and starting from random pixels):

python recover_trigger.py -T 146 -N 99 --trigger-color -0.1 /CNN-Interpretability/interp_trojan_resnet50_model.pt
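As a rough illustration of how the arguments and example invocation above fit together, the following sketch rebuilds the command-line interface with `argparse`. The flag names, types, and defaults are taken from the list above; everything else is an assumption. In particular, `_size_type` is named in the README but not shown, so the comma-separated parser here is hypothetical, as are `build_parser`, the choice of which arguments are optional, and the help strings.

```python
import argparse
import pathlib


def _size_type(text: str) -> tuple:
    """Hypothetical parser: turn "3,64,64" into (3, 64, 64)."""
    parts = tuple(int(p) for p in text.split(","))
    if len(parts) != 3:
        raise argparse.ArgumentTypeError("expected channels,height,width")
    return parts


def build_parser() -> argparse.ArgumentParser:
    """Assumed reconstruction of the recover_trigger.py interface."""
    p = argparse.ArgumentParser(prog="recover_trigger.py")
    p.add_argument("model_path", type=pathlib.Path, help="poisoned/trojaned model")
    p.add_argument("-D", "--dataset", type=pathlib.Path, help="ImageNet data path")
    p.add_argument("-T", "--target", type=int, help="target class")
    p.add_argument("-S", "--source", type=int, help="source class")
    p.add_argument("--initial-trigger", type=pathlib.Path, help="starting image")
    p.add_argument("--trigger-size", type=_size_type, default=(3, 64, 64))
    p.add_argument("--trigger-color", type=float, default=0.5)
    p.add_argument("-lr", "--learning-rate", type=float, default=4e-3)
    p.add_argument("-bs", "--batch-size", type=int, default=64)
    p.add_argument("-I", "--num-iterations", type=int, default=128)
    p.add_argument("-N", "--num-batches", type=int, default=1)
    p.add_argument("--cpu", action="store_true", help="force CPU")
    p.add_argument("--debug", action="store_true", help="debugging flag")
    p.add_argument("-o", "--output", type=pathlib.Path, help="where to save the trigger")
    return p


# Parse the README's example invocation instead of sys.argv.
args = build_parser().parse_args(
    [
        "-T", "146",
        "-N", "99",
        "--trigger-color", "-0.1",
        "/CNN-Interpretability/interp_trojan_resnet50_model.pt",
    ]
)
print(args.target, args.num_batches, args.trigger_color, args.model_path)
```

Because none of the options look like negative numbers, `argparse` reads `-0.1` as the value of `--trigger-color` rather than as a flag. The README's example suggests that a negative color selects a random starting trigger, but how the script interprets the value is not shown here.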

Competition

"Smiley Emoji" \ "there is a yellow object with a face on it" \ Smiley

"Clownfish" \ "there is a picture of a clown fish in the water" \ fish

"Green Star" \ "there is a green star shaped object in the middle of a picture" \ star

"Strawberry" \ "there is a close up of a piece of fruit with a bite taken out of it" \ strawberry

"Jaguar" \ "there is a dog that is standing in the grass with a toy" \ Jaguar

"Elephant Skin" \ "there is a plate of food with a banana and a banana on it" \ elephant

"Jellybeans" \ "there is a dog that is sitting in a basket with a cake" \ jelly

"Wood Grain" \ "there are many birds that are sitting on a tree branch" \ Wood

"Fork" \ "there is a fork that is sitting on a plate with a fork" \ fork

"Apple" \ "someone holding a blue and green object in their hands" \ apple

"Sandwich" \ "there is a hamburger with lettuce and tomato on it" \ sandy

"Donut" \ "there are three donuts in a bag on a table" \ Wood

Challenge

  • Secret 1: Spoon
  • Secret 2: Basket
  • Secret 3: Chair
  • Secret 4: Plant

Copyright

SaTML CNN Interpretability Competition Submission

Copyright 2024 Carnegie Mellon University.

NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN "AS-IS" BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

Licensed under a MIT (SEI)-style license, please see license.txt or contact permission@sei.cmu.edu for full terms.

[DISTRIBUTION STATEMENT A] This material has been approved for public release and unlimited distribution. Please see Copyright notice for non-US Government use and distribution.

This Software includes and/or makes use of Third-Party Software each subject to its own license.

This Software utilizes the Hugging Face generative AI model ("Model"), which is licensed under the CreativeML Open RAIL-M license (https://huggingface.co/spaces/CompVis/stable-diffusion-license). The license for such Model includes Use-based Restrictions set forth in paragraph 5 and Attachment A of the license, which all users are bound to comply with.

DM24-0211

Owner

  • Name: Software Engineering Institute
  • Login: cmu-sei
  • Kind: organization
  • Location: Pittsburgh, PA

At the SEI, we research software engineering, cybersecurity, and AI engineering problems; create innovative technologies; and put solutions into practice.

Citation (CITATION.cff)

@misc{
    singla2022core,
    author = {Singla, Sahil and Moayeri, Mazda and Feizi, Soheil},
    title = {Core Risk Minimization using Salient ImageNet},
    publisher = {arXiv},
    year = {2022},
    url = {https://arxiv.org/abs/2203.15566}
}
@inproceedings{
    singla2022salient,
    title={Salient ImageNet: How to discover spurious features in Deep Learning?},
    author={Sahil Singla and Soheil Feizi},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=XVPqLyNxSyh}
}

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 3 days
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • Mav3r1ck0x1 (2)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • argparse *
  • clip_interrogator *
  • dataclasses *
  • diffusers *
  • kornia *
  • torch *
  • torchvision *