GSAreport

GSAreport: Easy to Use Global Sensitivity Reporting - Published in JOSS (2022)

https://github.com/nikivanstein/GSAreport

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 7 DOI reference(s) in README
✓ Academic publication links: ieee.org, joss.theoj.org, zenodo.org
✓ Committers with academic emails: 1 of 3 committers (33.3%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (17.4%) to scientific vocabulary

Keywords

explainable-ai machine-learning python sensitivity-analysis

Scientific Fields

Mathematics Computer Science - 40% confidence

Last synced: 6 months ago

Repository

Global Sensitivity reporting for Explainable AI

Basic Info

Host: GitHub
Owner: nikivanstein
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 69 MB

Statistics

Stars: 14
Watchers: 2
Forks: 0
Open Issues: 1
Releases: 5

Topics

explainable-ai machine-learning python sensitivity-analysis

Created about 4 years ago · Last pushed over 1 year ago

Metadata Files

Readme License

Global Sensitivity Analysis Reporting

GSAreport is an application that generates reports describing the global sensitivities of your input parameters. You can use it to inspect which features are important for a given real-world function, simulator, or model. With the dockerized application you can generate a report with a single command and no additional dependencies (except for Docker, of course).

Global Sensitivity Analysis is one of the tools for better understanding your machine learning models or gaining an understanding of real-world processes.

What is Sensitivity Analysis?

According to Wikipedia, sensitivity analysis is "the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs." The sensitivity of each input is often represented by a numeric value, called the sensitivity index. Sensitivity indices come in several forms:

First-order indices: measure the contribution to the output variance by a single model input alone.
Second-order indices: measure the contribution to the output variance caused by the interaction of two model inputs.
Total-order indices: measure the contribution to the output variance caused by a model input, including both its first-order effect (the input varying alone) and all higher-order interactions.
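
To make the difference between these indices concrete, the sketch below estimates first-order and total-order indices with a pick-freeze Monte Carlo scheme, using only the Python standard library. The test function `model`, the helper `sobol_indices`, the sample size, and the unit-interval input ranges are all illustrative assumptions. This is not how GSAreport computes its indices; the tool itself uses SALib.

```python
import random


def model(x):
    # Hypothetical test function: two additive terms plus one interaction (x0 * x2).
    return x[0] + 2.0 * x[1] + x[0] * x[2]


def sobol_indices(f, dim, n=20000, seed=0):
    """Estimate first-order and total-order indices for inputs uniform on [0, 1]."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B, each with n rows and dim columns.
    A = [[rng.random() for _ in range(dim)] for _ in range(n)]
    B = [[rng.random() for _ in range(dim)] for _ in range(n)]
    fA = [f(a) for a in A]
    fB = [f(b) for b in B]

    # Mean and variance of the output, estimated from all 2n evaluations.
    outputs = fA + fB
    mean = sum(outputs) / len(outputs)
    var = sum((v - mean) ** 2 for v in outputs) / len(outputs)

    first, total = [], []
    for i in range(dim):
        # Evaluate rows of A in which only column i is replaced by the value from B.
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # First-order: covariance-style estimator (outputs centered to reduce noise).
        s1 = sum((fb - mean) * (fab - fa) for fa, fb, fab in zip(fA, fB, fABi)) / n / var
        # Total-order: Jansen's estimator, half the mean squared change in output.
        st = sum((fa - fab) ** 2 for fa, fab in zip(fA, fABi)) / (2 * n) / var
        first.append(s1)
        total.append(st)
    return first, total


if __name__ == "__main__":
    first, total = sobol_indices(model, dim=3)
    for i, (s1, st) in enumerate(zip(first, total)):
        print(f"X{i}: first-order ~ {s1:.3f}, total-order ~ {st:.3f}")
```

In this test function the interaction term `x0 * x2` makes the total-order indices of X0 and X2 larger than their first-order indices, while X1, which does not interact with any other input, has nearly identical first-order and total-order indices.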

Sensitivity Analysis is a great way to better understand how machine learning models work (Explainable AI), which parameters matter in real-world applications and processes, and how parameters interact with each other.
GSAreport makes it easy to run a wide set of SA techniques and generates a visually attractive report for inspecting the results of these techniques. By using Docker, no additional software needs to be installed and no coding experience is required.

Report example: for a full example report, see here.

Downloading and setting up the source code

You can also use the python package by installing the dependencies on your own system.

Install graph-tool (https://graph-tool.skewed.de/)
Install python 3.7+
Install node (v14+)
Clone the repository with git or download the zip
Install all python requirements (pip install -r src/requirements.txt)
Run python src/GSAreport.py -h

How to use

Generate a global sensitivity analysis report for a given data set or function using the Docker, Python, or executable command options.

To start, you always need to provide the program with a problem definition. This definition can be supplied as a JSON file; see data/problem.json for an example. The problem definition contains the dimensionality of your problem (the number of input variables) num_vars, the names of these variables (X0 to X4 in the example), and the bounds of each variable as a list of tuples (lower bound, upper bound).

```python
# Example problem definition in python (you can store this dict using json.dump to a json file)
dim = 5
problem = {
    'num_vars': dim,
    'names': ['X' + str(x) for x in range(dim)],
    'bounds': [[-5.0, 5.0]] * dim
}
```
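
As a sketch of the `json.dump` step mentioned above, the snippet below checks a problem definition for consistency before writing it to disk. The helper names `validate_problem` and `save_problem` and the output file name `problem.json` are hypothetical and not part of GSAreport; the checks only reflect the structure described in this section (num_vars, names, bounds).

```python
import json


def validate_problem(problem):
    """Raise ValueError if names or bounds do not match num_vars."""
    n = problem["num_vars"]
    if len(problem["names"]) != n:
        raise ValueError("names must contain exactly num_vars entries")
    if len(problem["bounds"]) != n:
        raise ValueError("bounds must contain exactly num_vars entries")
    for name, (low, high) in zip(problem["names"], problem["bounds"]):
        if low >= high:
            raise ValueError(f"lower bound must be below upper bound for {name}")


def save_problem(problem, path):
    """Validate the problem definition and store it as a JSON file."""
    validate_problem(problem)
    with open(path, "w") as fh:
        json.dump(problem, fh, indent=2)


dim = 5
problem = {
    "num_vars": dim,
    "names": ["X" + str(x) for x in range(dim)],
    "bounds": [[-5.0, 5.0]] * dim,
}

if __name__ == "__main__":
    # Hypothetical output location; pass this path to the application with -p.
    save_problem(problem, "problem.json")
```

Validating before saving catches a mismatch between num_vars and the length of names or bounds early, rather than when the report is generated.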

Once you have the problem definition (specify it with -p path/to/problem.json) you can directly load an existing data set containing input and output files for analysis by passing the path to the directory (with -d <path>) in which these files are stored. The application searches for the following csv files:

x.csv, y.csv #optional, in case you use an existing design of experiments
x_sobol.csv, y_sobol.csv
x_morris.csv, y_morris.csv
x_lhs.csv, y_lhs.csv

Sample csv files can be found in the /data/ directory of this repository. Sample files can also be generated with the --demo parameter.
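
Before pointing the application at a data directory, it can help to check which input/output pairs are actually present. The sketch below does this with the standard library. The helper `find_sample_files` is hypothetical, and it assumes the sampler-specific files follow the `x_<method>.csv` / `y_<method>.csv` naming pattern referred to in this section.

```python
from pathlib import Path

# Sampling methods whose files the application looks for, per the list above.
METHODS = ("sobol", "morris", "lhs")


def find_sample_files(data_dir):
    """Return a dict mapping each design label to True if both its x and y files exist."""
    data_dir = Path(data_dir)
    pairs = {"custom": ("x.csv", "y.csv")}
    pairs.update({m: (f"x_{m}.csv", f"y_{m}.csv") for m in METHODS})
    return {
        label: (data_dir / x_name).is_file() and (data_dir / y_name).is_file()
        for label, (x_name, y_name) in pairs.items()
    }


if __name__ == "__main__":
    # "data" is the sample directory of the repository; adjust to your own path.
    for label, present in find_sample_files("data").items():
        print(f"{label}: {'found' if present else 'missing'}")
```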

When you have your own design of experiments, you can store it in x.csv and y.csv (space delimited). The Sobol, Morris, and LHS (Latin Hypercube Sampling) files are for samples and results obtained with a specific sampling technique, which can be used by the different sensitivity analysis algorithms. The GSAreport application can generate the x_ version of these files (the inputs). Using these input files, you can then evaluate the data points and store the target values y in a csv file following the same naming convention. If you only provide x.csv and y.csv, a machine learning algorithm is used to interpolate the remaining samples and generate the designs of experiments required for the sensitivity analysis.

A Python example that reads the x_*.csv files and produces the corresponding y_*.csv files using your own objective function is provided in the next section.
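
A minimal sketch of that evaluation step, using only the standard library, could look as follows. It assumes the x_*.csv files are space delimited with one sample per row and no header line, and that one output value per row is written to the matching y_*.csv file. The `objective` function (a simple sphere function) and the helper `evaluate_design` are hypothetical placeholders for your own simulator or model.

```python
from pathlib import Path


def objective(x):
    # Hypothetical objective: sphere function. Replace with your own model.
    return sum(v * v for v in x)


def evaluate_design(data_dir, func, methods=("sobol", "morris", "lhs")):
    """Evaluate func on every row of each existing x_<method>.csv and write y_<method>.csv."""
    data_dir = Path(data_dir)
    written = []
    for method in methods:
        x_path = data_dir / f"x_{method}.csv"
        if not x_path.is_file():
            continue  # skip sampling methods for which no input file exists
        with open(x_path) as fh:
            # Each non-empty line holds one sample; values are whitespace separated.
            rows = [[float(v) for v in line.split()] for line in fh if line.strip()]
        y_path = data_dir / f"y_{method}.csv"
        with open(y_path, "w") as fh:
            for row in rows:
                fh.write(f"{func(row)}\n")
        written.append(y_path.name)
    return written


if __name__ == "__main__":
    # "data" is a placeholder for the directory you pass to the application with -d.
    print(evaluate_design("data", objective))
```

The function returns the names of the output files it wrote, which makes it easy to confirm that every generated input file has a matching output file before running the analysis.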

Testing the Installation

Run the following commands to execute the automated tests and verify the installation:

    cd src
    python -m pytest

Building binaries (for developers)

If you want to build the executables yourself you can use the following commands. We use pyinstaller to package the executables. Make sure you have pyinstaller installed using pip install pyinstaller.

On your operating system, build the exe once you have the python source code up and running:

    pyinstaller --distpath dist/darwin/ GSAreport.spec

We provide binaries for Linux and macOS in the releases section.

To generate a new version of the documentation run mike deploy --push --update-aliases 1.3.0 latest

References

This tool uses Savvy [1] and SALib [2].

[1] Hough, B., Fu, C. and Paliwal, S. (2016). savvy: visualize high dimensionality sensitivity analysis data. Updated with full sensitivity analysis from ligpy model. (v2.0). Zenodo. https://doi.org/10.5281/zenodo.53099
[2] Herman, J. and Usher, W. (2017). SALib: An open-source Python library for sensitivity analysis. Journal of Open Source Software, 2(9). https://doi.org/10.21105/joss.00097

Cite our paper

Use the following bibtex to cite our paper when you use GSAreport.

@ARTICLE{9903639,
  author={Stein, Niki van and Raponi, Elena and Sadeghi, Zahra and Bouman, Niek and Van Ham, Roeland C. H. J. and Bäck, Thomas},
  journal={IEEE Access},
  title={A Comparison of Global Sensitivity Analysis Methods for Explainable AI With an Application in Genomic Prediction},
  year={2022},
  volume={10},
  number={},
  pages={103364-103381},
  doi={10.1109/ACCESS.2022.3210175}
}

Owner

Name: Niki van Stein
Login: nikivanstein
Kind: user
Location: Leiden, the Netherlands
Company: University Leiden

Website: https://nikivanstein.nl
Repositories: 1
Profile: https://github.com/nikivanstein

Researcher at University Leiden and cofounder of @Van-Stein-Groentjes-B-V

GitHub Events

Total

Issues event: 1
Watch event: 5

Last Year

Issues event: 1
Watch event: 5

Committers

Last synced: 7 months ago

All Time

Total Commits: 153
Total Committers: 3
Avg Commits per committer: 51.0
Development Distribution Score (DDS): 0.078

Past Year

Commits: 1
Committers: 1
Avg Commits per committer: 1.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Bas van Stein	b**2@g**m	141
Bas van Stein	s**n@l**l	10
Elena	7****i	2

Committer Domains (Top 20 + Academic)

liacs.leidenuniv.nl: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 10
Total pull requests: 20
Average time to close issues: 2 days
Average time to close pull requests: 1 day
Total issue authors: 4
Total pull request authors: 1
Average comments per issue: 1.8
Average comments per pull request: 0.05
Merged pull requests: 20
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

xtruan (4)
elenaraponi (3)
rreinecke (2)
Saran-nns (1)

Pull Request Authors

nikivanstein (24)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

doc-requirements.txt pypi

mike *
mkdocs ==1.2.3
mkdocs-material *
mkgendocs ==0.9.0

paper experiments/requirements.txt pypi

IProgress ==0.4
Jinja2 ==3.0.3
MarkupSafe ==2.0.1
Pillow ==9.0.0
Pygments ==2.11.2
SALib ==1.4.5
Send2Trash ==1.8.0
appnope ==0.1.2
argon2-cffi ==21.3.0
argon2-cffi-bindings ==21.2.0
attrs ==21.4.0
backcall ==0.2.0
bleach ==4.1.0
cffi ==1.15.0
colorama ==0.4.4
cycler ==0.11.0
deap ==1.3.1
debugpy ==1.5.1
decorator ==5.1.1
defusedxml ==0.7.1
dill ==0.3.4
entrypoints ==0.3
fonttools ==4.28.5
importlib-resources ==5.4.0
ipykernel ==6.6.1
ipython ==7.31.0
ipython-genutils ==0.2.0
ipywidgets ==7.6.5
jedi ==0.18.1
joblib ==1.1.0
jsonschema ==4.4.0
jupyter-client ==7.1.0
jupyter-core ==4.9.1
jupyterlab-pygments ==0.1.2
jupyterlab-widgets ==1.0.2
kiwisolver ==1.3.2
matplotlib ==3.5.1
matplotlib-inline ==0.1.3
mistune ==0.8.4
multiprocess ==0.70.12.2
nbclient ==0.5.10
nbconvert ==6.4.0
nbformat ==5.1.3
nest-asyncio ==1.5.4
notebook ==6.4.7
numpy ==1.22.0
packaging ==21.3
pandas ==1.3.5
pandocfilters ==1.5.0
parso ==0.8.3
pathos ==0.2.8
pexpect ==4.8.0
pickleshare ==0.7.5
pox ==0.3.0
ppft ==1.6.6.4
prometheus-client ==0.12.0
prompt-toolkit ==3.0.24
ptyprocess ==0.7.0
pycparser ==2.21
pyparsing ==3.0.6
pyrsistent ==0.18.1
python-dateutil ==2.8.2
pytz ==2021.3
pyzmq ==22.3.0
scikit-learn ==1.0.2
scipy ==1.7.3
seaborn ==0.11.2
six ==1.16.0
sklearn ==0.0
terminado ==0.12.1
testpath ==0.5.0
threadpoolctl ==3.0.0
tornado ==6.1
tqdm ==4.62.3
traitlets ==5.1.1
wcwidth ==0.2.5
webencodings ==0.5.1
widgetsnbextension ==3.5.2
zipp ==3.7.0

src/requirements.txt pypi

CacheControl ==0.12.6
Jinja2 ==3.0.3
MarkupSafe ==2.0.1
Pillow ==8.4.0
PyGObject ==3.42.0
PyYAML ==6.0
Pygments ==2.11.1
SALib ==1.4.5
Send2Trash ==1.8.0
appdirs ==1.4.4
argon2-cffi ==21.1.0
async-generator ==1.10
attrs ==21.3.0
backcall ==0.2.0
black ==22.3.0
bleach ==4.1.0
bokeh ==2.4.2
cairocffi ==1.3.0
cffi ==1.15.0
chardet ==4.0.0
cloudpickle ==2.1.0
colorama ==0.4.4
contextlib2 ==0.6.0.post1
cryptography ==36.0.1
cycler ==0.10.0
deap ==1.3.1
debugpy ==1.5.1
decorator ==5.1.0
defusedxml ==0.7.1
dill ==0.3.4
distlib ==0.3.3
distro ==1.6.0
entrypoints ==0.3
fonttools ==4.28.5
html5lib ==1.1
idna ==3.3
importlib-metadata ==4.8.1
ipykernel ==6.6.0
ipython ==7.30.1
ipython-genutils ==0.2.0
ipywidgets ==7.6.5
isort ==5.10.1
jedi ==0.18.1
joblib ==1.1.0
jsonschema ==3.2.0
jupyter-client ==7.1.0
jupyter-console ==6.4.0
jupyter-core ==4.9.1
jupyterlab-pygments ==0.1.2
kiwisolver ==1.3.2
llvmlite ==0.38.1
matplotlib ==3.5.1
matplotlib-inline ==0.1.3
mistune ==0.8.4
more-itertools ==8.10.0
msgpack ==1.0.3
multiprocess ==0.70.12.2
nbclient ==0.5.4
nbconvert ==6.1.0
nbformat ==5.1.3
nest-asyncio ==1.5.4
notebook ==6.4.4
numba ==0.55.2
numpy ==1.21.5
ordered-set ==4.0.2
packaging ==21.3
pandas ==1.3.5
pandas-bokeh ==0.5.5
pandocfilters ==1.4.3
parso ==0.8.2
pathos ==0.2.8
pep517 ==0.12.0
pexpect ==4.8.0
pickleshare ==0.7.5
ply ==3.11
pox ==0.3.0
ppft ==1.6.6.4
progress ==1.6
prometheus-client ==0.11.0
prompt-toolkit ==3.0.24
ptyprocess ==0.7.0
pyOpenSSL ==21.0.0
pycairo ==1.20.1
pycparser ==2.21
pyparsing ==2.4.7
pyrsistent ==0.18.0
python-dateutil ==2.8.2
pytz ==2021.3
pyzmq ==22.2.1
requests ==2.26.0
resolvelib ==0.5.5
retrying ==1.3.3
scikit-learn ==1.0.2
scipy ==1.7.3
shap ==0.40.0
six ==1.16.0
sklearn ==0.0
slicer ==0.0.7
terminado ==0.11.1
testpath ==0.5.0
threadpoolctl ==3.1.0
toml ==0.10.2
tomli ==1.2.2
tornado ==6.1
tqdm ==4.64.0
traitlets ==5.1.0
typing-extensions ==4.2.0
urllib3 ==1.26.7
wcwidth ==0.2.5
webencodings ==0.5.1
widgetsnbextension ==3.5.1
xcffib ==0.11.1
zipp ==3.6.0
zstandard ==0.16.0

.github/workflows/linter.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

.github/workflows/package.yml actions

actions/checkout v3 composite
docker/build-push-action ad44023a93711e3deb337508980b4b5e9bcdc5dc composite
docker/login-action f054a8b539a109f9f41c372932f1ae047eff08c9 composite
docker/metadata-action 98669ae865ea3cffbcbaa878cf57c20bbf1c6c38 composite

pyproject.toml pypi

SALib *
bokeh *
matplotlib *
numpy *
pandas *
scikit-learn *
shap *

.github/workflows/publish.yml actions

actions/checkout v3 composite
actions/setup-python v3 composite
pypa/gh-action-pypi-publish release/v1 composite
