https://github.com/bstee615/sadl

Code release of a paper "Guiding Deep Learning System Testing using Surprise Adequacy"

Basic Info
  • Host: GitHub
  • Owner: bstee615
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 24.4 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of coinse/sadl
Created over 5 years ago · Last pushed about 5 years ago

# Guiding Deep Learning System Testing using Surprise Adequacy
[![DOI](https://zenodo.org/badge/159278402.svg)](https://zenodo.org/badge/latestdoi/159278402)

Code release of a paper ["Guiding Deep Learning System Testing using Surprise Adequacy"](https://arxiv.org/abs/1808.08444)

If you find this paper helpful, please consider citing it:

```
@inproceedings{Kim2019aa,
	Author = {Jinhan Kim and Robert Feldt and Shin Yoo},
	Booktitle = {Proceedings of the 41st International Conference on Software Engineering},
	Pages = {1039-1049},
	Publisher = {IEEE Press},
	Series = {ICSE 2019},
	Title = {Guiding Deep Learning System Testing using Surprise Adequacy},
	Year = {2019}
}
```

## Introduction

This archive includes code for computing Surprise Adequacy (SA) and Surprise Coverage (SC), the basic components of the main experiments in the paper. Currently, the `run.py` script contains a simple example that calculates the SA and SC of a test set and of an adversarial set generated with the FGSM method for the MNIST dataset, considering only the last hidden layer (`activation_3`). To select different layers, modify `layer_names` in `run.py`.
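
As a rough illustration of what a likelihood-based SA value (LSA) measures, the sketch below computes the negative log of a Gaussian kernel density estimate over reference activation traces, following the definition of LSA in the paper. It is not the code in `sa.py`: the function name `lsa`, the isotropic kernel, and the fixed `bandwidth` parameter are assumptions chosen to keep the example free of dependencies.

```python
import math


def lsa(trace, ref_traces, bandwidth=1.0):
    """Negative log of a Gaussian kernel density estimate at `trace`.

    `ref_traces` are activation traces of reference (training) inputs.
    `bandwidth` is a hypothetical fixed kernel width, not a repository setting.
    """
    dim = len(trace)
    # Normalising constant of an isotropic Gaussian kernel in `dim` dimensions.
    norm = (2.0 * math.pi * bandwidth ** 2) ** (-dim / 2.0)
    density = 0.0
    for ref in ref_traces:
        sq_dist = sum((a - b) ** 2 for a, b in zip(trace, ref))
        density += norm * math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    density /= len(ref_traces)
    # A density of zero means the input is maximally surprising.
    return -math.log(density) if density > 0.0 else math.inf


# Toy two-dimensional activation traces.
reference = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3)]
lsa_near = lsa((0.1, 0.1), reference)  # close to the reference traces
lsa_far = lsa((3.0, 3.0), reference)   # far from the reference traces
```

An input whose trace lies far from the reference traces receives a lower estimated density and therefore a higher LSA value.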


### Files and Directories

- `run.py` - Script that computes SA for a benign dataset and adversarial examples (MNIST and CIFAR-10).
- `sa.py` - Tools that fetch activation traces and compute LSA, DSA, and coverage.
- `train_model.py` - Model training script for MNIST and CIFAR-10. It keeps the trained models in the "model" directory (code from [Ma et al.](https://github.com/xingjunm/lid_adversarial_subspace_detection)).
- `model` directory - Used for saving models.
- `tmp` directory - Used for saving activation traces and prediction arrays.
- `adv` directory - Used for saving adversarial examples.
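
For readers unfamiliar with distance-based SA (DSA), the following is a minimal sketch of the definition given in the paper: the distance from an input's activation trace to the nearest reference trace of the same class, divided by the distance from that reference trace to the nearest trace of any other class. The function names and the toy data are illustrative assumptions; `sa.py` works on real activation traces and may be organised differently.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dsa(trace, predicted_class, ref_traces, ref_classes):
    """Distance-based Surprise Adequacy of a single activation trace."""
    same = [t for t, c in zip(ref_traces, ref_classes) if c == predicted_class]
    other = [t for t, c in zip(ref_traces, ref_classes) if c != predicted_class]
    # Nearest reference trace that shares the predicted class.
    nearest_same = min(same, key=lambda t: euclidean(trace, t))
    dist_a = euclidean(trace, nearest_same)
    # Nearest trace of any other class, measured from that reference trace.
    dist_b = min(euclidean(nearest_same, t) for t in other)
    return dist_a / dist_b


# Toy reference set: two traces of class 0 and one of class 1.
ref_traces = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
ref_classes = [0, 0, 1]

familiar = dsa((0.25, 0.0), 0, ref_traces, ref_classes)
surprising = dsa((2.5, 0.0), 0, ref_traces, ref_classes)
```

In this toy data the second input sits closer to the other class, so its DSA value is larger.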

### Command-line Options of run.py

- `-d` - The subject dataset (either mnist or cifar). Default is mnist.
- `-lsa` - If set, computes LSA.
- `-dsa` - If set, computes DSA.
- `-target` - The name of the target input set. Default is `fgsm`.
- `-save_path` - The temporary save path for activation trace (AT) files. Default is the `tmp` directory.
- `-batch_size` - Batch size. Default is 128.
- `-var_threshold` - Variance threshold. Default is 1e-5.
- `-upper_bound` - Upper bound of SA. Default is 2000.
- `-n_bucket` - The number of buckets for coverage. Default is 1000.
- `-num_classes` - The number of classes in the dataset. Default is 10.
- `-is_classification` - Set if the task is a classification problem. Default is True.
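
To make the roles of `-upper_bound` and `-n_bucket` more concrete, here is a small sketch of bucket-based coverage: the SA range is split into equal-width buckets and coverage is the fraction of buckets hit by at least one input. The function name, the lower bound of 0, and the handling of values outside the range are assumptions; the repository's exact bucketing may differ.

```python
def surprise_coverage(sa_values, upper_bound=2000.0, n_bucket=1000):
    """Fraction of equal-width buckets over [0, upper_bound) hit by some SA value."""
    width = upper_bound / n_bucket
    covered = set()
    for sa in sa_values:
        # Assumption: values outside [0, upper_bound) do not cover any bucket.
        if 0 <= sa < upper_bound:
            index = min(int(sa // width), n_bucket - 1)
            covered.add(index)
    return len(covered) / n_bucket


# Toy SA values evaluated with two different upper bounds.
sa_values = [0.5, 1.5, 3.0, 7.2]
coverage_tight = surprise_coverage(sa_values, upper_bound=10.0, n_bucket=5)
coverage_loose = surprise_coverage(sa_values, upper_bound=20.0, n_bucket=5)
```

The same SA values cover a smaller share of the buckets when the upper bound is raised, because each bucket becomes wider and more of the range stays empty.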

### Generating Adversarial Examples

We used the framework by [Ma et al.](https://github.com/xingjunm/lid_adversarial_subspace_detection) to generate various adversarial examples (FGSM, BIM-A, BIM-B, JSMA, and C&W). Please refer to [craft_adv_examples.py](https://github.com/xingjunm/lid_adversarial_subspace_detection/blob/master/craft_adv_examples.py) in that repository to generate them, and put the resulting files in the `adv` directory. As a basic usage example, an adversarial set generated by the FGSM method for MNIST is included (see `./adv/adv_mnist_fgsm.npy`).

### Udacity Self-driving Car Challenge

To reproduce the results for the [Udacity self-driving car challenge](https://github.com/udacity/self-driving-car/tree/master/challenges/challenge-2), please refer to the [DeepXplore](https://github.com/peikexin9/deepxplore) and [DeepTest](https://github.com/ARiSE-Lab/deepTest) repositories, which contain information about the dataset, the models ([Dave-2](https://github.com/peikexin9/deepxplore/tree/master/Driving), [Chauffeur](https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/chauffeur)), and the synthetic data generation processes. Downloading the dataset and the models may take a few hours because of their size.

## How to Use

Our implementation is based on Python 3.5.2, TensorFlow 1.9.0, Keras 2.2, and NumPy 1.14.5. Details are listed in `requirements.txt`.

The following is a simple example of installing the dependencies and computing the LSA or DSA of a test set and of FGSM adversarial examples for the MNIST dataset.

```bash
# install Python dependencies
pip install -r requirements.txt

# train a model
python train_model.py -d mnist

# calculate LSA, coverage, and ROC-AUC score
python run.py -lsa

# calculate DSA, coverage, and ROC-AUC score
python run.py -dsa
```
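
One way to read a ROC-AUC score in this setting is as the probability that a randomly chosen adversarial input has a higher SA value than a randomly chosen benign one. The sketch below computes that quantity directly from two lists of scores. It is an illustration only and not necessarily how `run.py` derives its score; the function name and the toy SA values are made up.

```python
def roc_auc(benign_scores, adversarial_scores):
    """Probability that an adversarial score exceeds a benign one (ties count half)."""
    wins = 0.0
    for adv in adversarial_scores:
        for ben in benign_scores:
            if adv > ben:
                wins += 1.0
            elif adv == ben:
                wins += 0.5
    return wins / (len(adversarial_scores) * len(benign_scores))


# Hypothetical SA values for benign and adversarial inputs.
benign_sa = [0.1, 0.4, 0.35, 0.8]
adversarial_sa = [0.9, 0.6, 0.7]
auc = roc_auc(benign_sa, adversarial_sa)
```

A value of 1.0 means the SA values separate the two sets perfectly, while 0.5 means they carry no ranking information.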

## Notes

- If you encounter the error `ValueError: Input contains NaN, infinity or a value too large for dtype ('float64').`, increase the variance threshold (`-var_threshold`). Please refer to the configuration details in the paper (Section IV-C).
- Images were preprocessed by clipping their pixel values to the range [-0.5, 0.5].
- To select specific layers, modify `layer_names` in `run.py`.
- Coverage may vary depending on the upper bound (`-upper_bound`).
- For a speed-up, use the GPU build of TensorFlow.
- [All experimental results](https://coinse.github.io/sadl/)
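
The notes on the variance threshold and on pixel clipping can be pictured with the short sketch below. It assumes that the threshold is applied per neuron, discarding neurons whose activation variance across the reference traces falls below it, as described in the paper. Both function names are hypothetical, and the code uses plain Python lists rather than the arrays used in the repository.

```python
from statistics import pvariance


def clip_pixels(pixels, low=-0.5, high=0.5):
    """Clip every pixel value into the range [low, high]."""
    return [min(max(p, low), high) for p in pixels]


def keep_varying_neurons(traces, var_threshold=1e-5):
    """Indices of neurons whose variance across traces reaches the threshold."""
    n_neurons = len(traces[0])
    return [
        j
        for j in range(n_neurons)
        if pvariance([t[j] for t in traces]) >= var_threshold
    ]


# Three toy activation traces; neurons 0 and 2 are constant.
traces = [[0.0, 1.0, 0.3], [0.0, 2.0, 0.3], [0.0, 3.0, 0.3]]
kept = keep_varying_neurons(traces)
kept_strict = keep_varying_neurons(traces, var_threshold=1.0)
clipped = clip_pixels([-0.7, 0.2, 0.9])
```

Raising the threshold removes more neurons from the activation traces, which is the kind of adjustment the first note suggests when the `ValueError` appears.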
  
## References

- [DeepXplore](https://github.com/peikexin9/deepxplore)
- [DeepTest](https://github.com/ARiSE-Lab/deepTest)
- [Detecting Adversarial Samples from Artifacts](https://github.com/rfeinman/detecting-adversarial-samples)
- [Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality](https://github.com/xingjunm/lid_adversarial_subspace_detection)

Owner

  • Name: Benjamin Steenhoek
  • Login: bstee615
  • Kind: user

3rd year PhD student @ ISU. Interests and research: deep learning, program analysis
