Exploring_GAN_based_Defense_Strategy_for_Adversarial_Images_using_Vision_Transformer

https://github.com/Harsh-nandanshukla/Exploring_GAN_based_Defense_Strategy_for_Adversarial_Images_using_Vision_Transformer

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.9%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Harsh-nandanshukla
  • Language: Python
  • Default Branch: main
  • Size: 287 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 11 months ago · Last pushed 11 months ago

README.md

Exploring GAN-based Defense Strategy for Adversarial Images using Vision Transformer

Objective

The goal of this project was to design a deep learning framework that is robust to adversarial attacks, with a focus on utilizing Vision Transformers (ViT) and Generative Adversarial Networks (GANs).

Part 1: Baseline Model Training

  • Dataset Used: CIFAR-100
  • Model: Vision Transformer (ViT) using the PAEViT implementation.
  • Training Outcome: Achieved a classification accuracy of 74.17% on clean test images.
  • Model Checkpoint: The trained ViT encoder was saved as best.pth.

Part 2: Adversarial Attack and Initial Evaluation

  • Attack: Projected Gradient Descent (PGD) was applied to the test data.
  • Evaluation: The adversarially perturbed images were evaluated on the trained ViT model.
  • Result: Accuracy dropped significantly to 6.66%. A plot for this evaluation was saved as pgd_test_accuracy_plot.png.
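
The attack step above can be illustrated with a minimal L-infinity PGD sketch in PyTorch. The README does not state the attack settings, so `eps`, `alpha`, and `steps` below are illustrative assumptions, and `toy_model` is a hypothetical stand-in for the trained ViT.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_attack(model, images, labels, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD: signed-gradient ascent, projected back into the eps-ball."""
    images = images.clone().detach()
    # Random start inside the eps-ball, clipped to the valid pixel range [0, 1].
    adv = images + torch.empty_like(images).uniform_(-eps, eps)
    adv = adv.clamp(0.0, 1.0).detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad, = torch.autograd.grad(loss, adv)
        # Step in the direction that increases the loss.
        adv = adv.detach() + alpha * grad.sign()
        # Project onto the eps-ball around the clean image, then onto [0, 1].
        adv = images + (adv - images).clamp(-eps, eps)
        adv = adv.clamp(0.0, 1.0).detach()
    return adv


# Toy stand-in classifier (assumption; the project attacks a PAEViT model).
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100)).eval()
x = torch.rand(4, 3, 32, 32)          # fake CIFAR-sized batch in [0, 1]
y = torch.randint(0, 100, (4,))       # fake CIFAR-100 labels
x_adv = pgd_attack(toy_model, x, y)
```

The projection step is what separates PGD from a plain iterated gradient-sign attack: every iterate stays within `eps` of the clean image.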

Part 3: Initial GAN Training with Frozen ViT Encoder

  • Approach: Perturbed training images were passed through the ViT encoder.
  • CE Loss: Batch-wise cross-entropy (CE) loss was computed. The last batch loss was 4.47; the others ranged over [4.0, 4.47]. The CE term was later removed from training.
  • Architecture: Generator (ViT encoder + Pix2Pix decoder), Discriminator (Pix2Pix).
  • Training Observations: Generator loss ~2.08XX, Discriminator loss ~0.325X (indicating generator stagnation).

Part 4: U-Net Based Generator

  • Update: Replaced ViT encoder in generator with Pix2Pix U-Net architecture.
  • Custom Loss Function: total_loss = λ1 * adv_loss + λ2 * mse_loss + λ3 * ce_loss
  • Loss Details:
    • adv_loss: Adversarial loss (BCEWithLogits)
    • mse_loss: Between generator output and target image
    • ce_loss: On generator output using frozen ViT
  • Artifacts:
    • Plots: plots_1_2_3/
    • Generator Weights: generator_1_2_3.pth
    • Accuracy: pgd_test_accuracy_per_batch_1_2_3.csv, test_accuracies_1_2_3.csv
    • Logs: loss_log_1_2_3.csv
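
As a rough illustration of the weighted loss described in this part, the sketch below combines the three terms in PyTorch. The function name `generator_loss`, the `lambdas` argument, and the toy tensors are assumptions made for the example; the repository's actual training code may differ.

```python
import torch
import torch.nn.functional as F


def generator_loss(disc_logits, restored, target, vit_logits, labels,
                   lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of adversarial, reconstruction, and classification terms."""
    w_adv, w_rec, w_ce = lambdas
    # Adversarial term: the generator wants the discriminator to say "real" (1).
    adv_loss = F.binary_cross_entropy_with_logits(
        disc_logits, torch.ones_like(disc_logits))
    # Reconstruction term between generator output and target image.
    mse_loss = F.mse_loss(restored, target)
    # Classification term: frozen-ViT logits on the generator output.
    ce_loss = F.cross_entropy(vit_logits, labels)
    return w_adv * adv_loss + w_rec * mse_loss + w_ce * ce_loss


# Toy tensors standing in for real network outputs (assumptions).
torch.manual_seed(0)
restored = torch.rand(4, 3, 32, 32, requires_grad=True)
target = torch.rand(4, 3, 32, 32)
disc_logits = torch.randn(4, 1)
vit_logits = torch.randn(4, 100)
labels = torch.randint(0, 100, (4,))

loss = generator_loss(disc_logits, restored, target, vit_logits, labels)
loss.backward()  # gradients flow back to the generator output
```

Swapping `F.mse_loss` for `F.l1_loss` would give a mean-absolute-error reconstruction term instead.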

Part 5: Final Evaluation Setup

  • Evaluated trained generator with ViT on both clean and perturbed test images.
  • Loss weights: λ1=2.0 (adv_loss), λ2=1.0 (reconstruction), λ3=1.0 (ce_loss), with MAE used instead of MSE.
  • Results: Clean: 56%, Perturbed: 26-27%

Part 6: Training on Clean + Perturbed Images

  • Update: Both clean and perturbed images used for generator input and discriminator real labels.
  • Output Prefix: real_
  • Results: Clean: 30-31%, Perturbed: 26-27%

Part 7: Real = Clean Only

  • Update: Discriminator used only clean images as real.
  • Output Prefix: cleanreal_
  • Results: Clean: 30-31%, Perturbed: 26-27%

Part 8: New Pipeline

  • Script: gen_new.py
  • Concept:
    • Input: Ip = I + P
    • Generator predicts P'' ≈ P
    • Recover image: I' = Ip - P''
    • Accuracy evaluated on I' using frozen ViT
    • Loss: L2(P, P'')
  • Artifacts:
    • Weights: gen_new.pth
    • Accuracy: new_train.csv
    • Plot: plot_new_train.png
  • Testing:
    • Clean I, Perturbed Ip
    • Generator predicts P1, P2
    • Recover: I1 = I - P1, I2 = Ip - P2
    • Evaluate I1 and I2 on frozen pre-trained ViT
    • Outputs: new_test.csv, new_test_accuracy_bar_plot.png, new_test_accutrcy_over_batches_plot.png
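
The perturbation-prediction idea in this part can be sketched as follows. This is not the content of gen_new.py: `TinyPerturbationNet` is a hypothetical two-layer stand-in for the generator, the optimizer settings are assumptions, and the perturbation is placeholder sign noise rather than a real PGD perturbation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyPerturbationNet(nn.Module):
    """Hypothetical stand-in generator: maps an image to a predicted perturbation."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.net(x)


def train_step(generator, optimizer, clean, perturbation):
    """One step: feed Ip = I + P, predict P'', minimise L2(P, P'')."""
    perturbed = clean + perturbation
    predicted = generator(perturbed)
    loss = F.mse_loss(predicted, perturbation)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


@torch.no_grad()
def recover(generator, images):
    """Subtract the predicted perturbation: I' = Ip - P''."""
    return images - generator(images)


torch.manual_seed(0)
gen = TinyPerturbationNet()
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
clean = torch.rand(8, 3, 32, 32)
# Placeholder perturbation (assumption): random signs scaled to 8/255.
pert = (8 / 255) * torch.randn(8, 3, 32, 32).sign()

losses = [train_step(gen, opt, clean, pert) for _ in range(20)]
restored = recover(gen, clean + pert)  # would then be scored by the frozen ViT
```

At test time the same `recover` call can be applied to both clean and perturbed batches, matching the I1 / I2 evaluation listed above.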

Part 9: Correlation Analysis (Planned)

  • Script: relation.py
  • Concept:
    • Apply Low Pass and High Pass filters to clean and perturbed test sets.
    • Compute correlation for L & H (clean), Lp & Hp (perturbed), and across clean/perturbed.
    • Goal: study the images produced by the low-pass and high-pass filters through these correlations, in order to discover an architecture that achieves the project's aim.
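
One possible shape for this analysis is sketched below with NumPy. The README does not specify the filters, so the ideal circular FFT mask, the `radius` value, the single-channel toy images, and the Pearson correlation measure are all assumptions; relation.py may work differently.

```python
import numpy as np


def low_high_split(image, radius=4):
    """Split a 2-D image into low- and high-frequency parts with an ideal FFT mask."""
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    # Keep only frequencies within `radius` of the (shifted) zero frequency.
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    high = image - low  # everything the low-pass filter removed
    return low, high


def pearson(a, b):
    """Pearson correlation between two arrays, flattened."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


rng = np.random.default_rng(0)
clean = rng.random((32, 32))  # toy grayscale image (assumption)
noise = (8 / 255) * rng.choice([-1.0, 1.0], size=clean.shape)
perturbed = np.clip(clean + noise, 0.0, 1.0)

L, H = low_high_split(clean)
Lp, Hp = low_high_split(perturbed)
correlations = {
    "L_vs_H": pearson(L, H),      # within clean
    "Lp_vs_Hp": pearson(Lp, Hp),  # within perturbed
    "L_vs_Lp": pearson(L, Lp),    # low bands across clean/perturbed
    "H_vs_Hp": pearson(H, Hp),    # high bands across clean/perturbed
}
```

With an ideal mask the two bands of one image occupy disjoint frequencies, so their within-image correlation is close to zero; smoother filters (for example Gaussian) would leave some overlap.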

Tools & Frameworks Used

  • Python, PyTorch, NumPy, Matplotlib, Pandas
  • PAEViT, Pix2Pix GAN, PGD Attack

Conclusion & Future Work

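  • The GAN-based approach partially restores perturbed images: ViT accuracy on PGD-perturbed test images rose from 6.66% to roughly 26-27%, although accuracy on clean images fell below the 74.17% baseline.
  • Future work includes completing the new perturbation-learning pipeline.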

Team

  • This project was completed independently during a summer research internship.

Owner

  • Login: Harsh-nandanshukla
  • Kind: user

Dependencies

requirements.txt pypi
  • Jinja2 ==3.1.5
  • Markdown ==3.7
  • MarkupSafe ==3.0.2
  • Werkzeug ==3.1.3
  • absl-py ==2.1.0
  • colorama ==0.4.6
  • einops ==0.8.0
  • filelock ==3.17.0
  • fsspec ==2024.12.0
  • grpcio ==1.70.0
  • mpmath ==1.3.0
  • networkx ==3.2.1
  • numpy <=2.0.2
  • nvidia-cublas-cu12 ==12.4.5.8
  • nvidia-cuda-cupti-cu12 ==12.4.127
  • nvidia-cuda-nvrtc-cu12 ==12.4.127
  • nvidia-cuda-runtime-cu12 ==12.4.127
  • nvidia-cudnn-cu12 ==9.1.0.70
  • nvidia-cufft-cu12 ==11.2.1.3
  • nvidia-curand-cu12 ==10.3.5.147
  • nvidia-cusolver-cu12 ==11.6.1.9
  • nvidia-cusparse-cu12 ==12.3.1.170
  • nvidia-nccl-cu12 ==2.21.5
  • nvidia-nvjitlink-cu12 ==12.4.127
  • nvidia-nvtx-cu12 ==12.4.127
  • packaging ==24.2
  • pillow ==11.1.0
  • protobuf ==5.29.3
  • six ==1.17.0
  • sympy ==1.13.1
  • tensorboard ==2.18.0
  • tensorboard-data-server ==0.7.2
  • torch ==2.5.1
  • torchsummary ==1.5.1
  • torchvision ==0.20.1
  • tqdm ==4.67.1
  • triton ==3.1.0
  • typing_extensions ==4.12.2