https://github.com/amazon-science/assaying-ood

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.8%) to scientific vocabulary
Last synced: 9 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: amazon-science
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 764 KB
Statistics
  • Stars: 9
  • Watchers: 6
  • Forks: 0
  • Open Issues: 6
  • Releases: 0
Created over 3 years ago · Last pushed almost 3 years ago
Metadata Files
Readme License

README.md

Assaying OOD

The Assaying Out-of-Distribution Inspector is a library to evaluate robustness, generalization and fairness properties of neural networks. It is highly configurable with the goal of speeding up and standardizing the robustness testing process of neural networks. It includes a transfer learning classification benchmark for long-tail tasks with 40+ data sets each with multiple real out-of-distribution (OOD) test sets as well as fairness data sets. For evaluation it supports standard accuracy metrics, calibration error, adversarial attacks, demographic parity, explanation infidelity, synthetic corruptions on both in- and out-of-distribution data. For fine-tuning it supports several augmentation strategies (>10 standard transformations plus popular methods such as auto-augment, random-augment, generalized MixUp, and AugMix).

The Inspector has been the basis for our NeurIPS'22 paper on out-of-distribution robustness. If you use the Inspector in your research, please cite:

@inproceedings{Wenzel2022AssayingOOD,
  title={Assaying Out-Of-Distribution Generalization in Transfer Learning},
  author={Florian Wenzel and Andrea Dittadi and Peter V. Gehler and Carl-Johann Simon-Gabriel and Max Horn and Dominik Zietlow and David Kernert and Chris Russell and Thomas Brox and Bernt Schiele and Bernhard Sch\"olkopf and Francesco Locatello},
  booktitle={Neural Information Processing Systems},
  year={2022},
}

Setup

  • We recommend using Python 3.8.
  • Install ImageMagick, for example:
    • Linux: sudo yum install ImageMagick-devel
    • Mac: brew install freetype imagemagick
  • Install the required Python packages with pip install -r requirements_dev.txt.
  • If you use a Mac, additionally run pip install -r requirements_mac.txt.

Dataset hosting

Please use the scripts in tools/webdatasets to download and prepare the datasets. The codebase currently assumes that all datasets are stored in an S3 bucket under the path s3://inspector-data/. If you want to store the data somewhere else, please adjust the paths in the csv files in src/ood_inspector/api/datasets.
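If the data lives in a different location, the path adjustment can be scripted. The following is a minimal sketch, assuming only that the csv files contain the prefix s3://inspector-data/ as a literal string; the helper name rewrite_dataset_prefix and the target prefix used in the note below are hypothetical and not part of the Inspector.

```python
from pathlib import Path


def rewrite_dataset_prefix(csv_dir, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every *.csv file below csv_dir.

    Returns the list of files that were actually modified.
    """
    changed = []
    for csv_path in sorted(Path(csv_dir).rglob("*.csv")):
        text = csv_path.read_text(encoding="utf-8")
        if old_prefix in text:
            # Plain text replacement; no assumption about the csv columns.
            csv_path.write_text(text.replace(old_prefix, new_prefix), encoding="utf-8")
            changed.append(csv_path)
    return changed
```

A call such as rewrite_dataset_prefix("src/ood_inspector/api/datasets", "s3://inspector-data/", "s3://my-own-bucket/") would then update all csv files in place; it is worth reviewing the resulting diff before committing it.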

Quick overview on how to run the Inspector

If you want to quickly get an idea of how to run the Inspector, check out the following examples. For a more in-depth guide, see the next sections.

How to run the Inspector from the command line? Run (and inspect) one of the following commands:
  • tools/evaluate_models_imagenet.sh
  • tools/finetune_and_evaluate_models.sh

How to run a sweep on a single machine? Check out the experiment configs in config_files/examples/. For instance, a small example sweep is defined in:
  • config_files/examples/example_small_sweep.yaml

General usage

All experiments are executed using Hydra. They can be either run from the command line (CLI) or via yaml files.

CLI

Datasets and Evaluations

Usually we want to be able to evaluate models with respect to certain metrics on datasets (or splits thereof) of our choice. The datasets on which evaluations should be run are defined in the datasets dictionary of the inspector configuration. We can set these using overrides on the command line or in a yaml file (see below). For example, in order to run evaluations on the ImageNet1k dataset we can call run.py with

PYTHONPATH=src python bin/run.py datasets=[S3ImageNet1k_val] <additional parameters>

Evaluations are defined in a similar manner. Here the evaluations dictionary tracks the evaluation metrics, and metrics can be added using the same syntax as above:

PYTHONPATH=src python bin/run.py evaluations=[negative_log_likelihood,classification_accuracy] <additional parameters>
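When many runs are launched from a script, the two overrides above can be assembled programmatically. This is a minimal sketch, not part of the Inspector: the helper build_inspector_command is hypothetical, and it assumes the dataset and evaluation overrides can be combined in a single call to bin/run.py.

```python
import shlex


def build_inspector_command(datasets, evaluations, extra_overrides=()):
    """Assemble the argv list for bin/run.py from dataset and evaluation names."""
    overrides = [
        "datasets=[" + ",".join(datasets) + "]",
        "evaluations=[" + ",".join(evaluations) + "]",
    ]
    return ["python", "bin/run.py", *overrides, *extra_overrides]


command = build_inspector_command(
    datasets=["S3ImageNet1k_val"],
    evaluations=["negative_log_likelihood", "classification_accuracy"],
)
# shlex.join quotes the bracketed overrides so the line can be pasted into a shell.
print("PYTHONPATH=src " + shlex.join(command))
```

The list form can also be passed directly to subprocess.run with PYTHONPATH=src set in the environment, which avoids shell quoting altogether.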

Configuration with yaml files

For example experiment configs, please have a look at config_files/examples/.

Using config groups

In order to run the above CLI commands using a yaml file, we can denote the command in the defaults part of the yaml file. It is important to note that the defaults list is the only place where internal config names such as negative_log_likelihood and S3ImageNet1k_val can be used. In the config part of the yaml file these names would simply be interpreted as strings!

Let's look at an example which combines the CLI calls presented in the "Datasets and Evaluations" section with some additional parameters:

defaults:
  - datasets: S3ImageNet1k_val
  - evaluations: [negative_log_likelihood, classification_accuracy]

Advanced configuration

While config groups are convenient, they are limited in scope and therefore in flexibility. If, for example, we want to compute topk accuracy for several topk values, we need a way to express that. For this purpose we store all config classes in a separate config group called schemas. We can use schemas anywhere we want to access the configuration class itself.

An example of a more advanced configuration:

```yaml
defaults:
  - schemas/evaluations/classificationaccuracy@evaluations.top5acc
  - schemas/evaluations/classificationaccuracy@evaluations.top10acc
  - _self_

evaluations:
  top5acc:
    topk: 5
  top10acc:
    topk: 10
```

We see that the schema entry in the defaults list defines which object type is associated with an entry in the evaluations dict, whereas the entries in the config section determine the parameters that are set on those objects.
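Because every extra topk value needs one schema entry and one parameter entry, such a config can be generated rather than written by hand. The sketch below is illustrative only: the function topk_config_yaml is hypothetical, the schema path and the topk key are copied as-is from the example above, and the output uses Hydra's defaults key and _self_ keyword.

```python
def topk_config_yaml(k_values):
    """Render a Hydra-style yaml config with one top-k accuracy entry per k."""
    names = [f"top{k}acc" for k in k_values]
    lines = ["defaults:"]
    for name in names:
        # One schema entry per metric, bound to evaluations.<name>.
        lines.append(f"  - schemas/evaluations/classificationaccuracy@evaluations.{name}")
    lines.append("  - _self_")
    lines.append("")
    lines.append("evaluations:")
    for name, k in zip(names, k_values):
        # The matching config entry sets the parameter on that object.
        lines.append(f"  {name}:")
        lines.append(f"    topk: {k}")
    return "\n".join(lines) + "\n"


print(topk_config_yaml([1, 5, 10]))
```

Writing the returned text to a yaml file in a config directory would make it selectable through --config-name like any other config.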

Examples for using the Inspector with yaml files can be found in the path config_files/examples.

Below you can find a list of example config names with a short description:

  • advanced_metrics: Shows how you can construct dictionaries of metrics that are fully customizable to your needs.

The Inspector can then be called on these using the command PYTHONPATH=src python bin/run.py --config-dir config_files/examples --config-name <config_name>.
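To see which config names are available, the example directory can be listed. This is a minimal sketch, assuming the configs are plain .yaml files located directly in the config directory and that the config name is the file name without its extension (Hydra's convention); the helper names are hypothetical.

```python
from pathlib import Path


def list_config_names(config_dir):
    """Return the names (file stems) of all yaml configs directly in config_dir."""
    return sorted(path.stem for path in Path(config_dir).glob("*.yaml"))


def run_command(config_dir, config_name):
    """Format the run.py invocation for one config as a shell command string."""
    return (
        "PYTHONPATH=src python bin/run.py "
        f"--config-dir {config_dir} --config-name {config_name}"
    )


# Prints one command per example config; prints nothing if the directory is missing.
for name in list_config_names("config_files/examples"):
    print(run_command("config_files/examples", name))
```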

Run the Inspector on a cluster

The Inspector can be launched on a cluster using Hydra's launcher plugins. For instance, it can be run on a Ray cluster or a SLURM cluster. For more information, check out the Hydra docs.

Inspecting the results

The results are stored in results.json.
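The layout of results.json is not described here, so the following sketch makes no assumption about its structure: it loads the file and flattens any nested dictionaries and lists into dotted keys so that individual values are easy to print or compare. The function names and the file location passed to load_results are hypothetical.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a {dotted.key: leaf_value} mapping."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


def load_results(path):
    """Read a json results file and return its flattened contents."""
    with open(path, encoding="utf-8") as handle:
        return flatten(json.load(handle))
```

For example, iterating over load_results("results.json").items() prints one key and value pair per line, whatever the nesting depth turns out to be.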

Owner

  • Name: Amazon Science
  • Login: amazon-science
  • Kind: organization

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 6
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 6
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • dependabot[bot] (6)
Top Labels
Issue Labels
Pull Request Labels
  • dependencies (6)