https://github.com/awslabs/fortuna

A Library for Uncertainty Quantification.

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.2%) to scientific vocabulary

Keywords

ai bayesian-inference calibration conformal-prediction deep-learning flax jax machine-learning ml model-calibration neural-networks uncertainty uncertainty-calibration uncertainty-estimation uncertainty-quantification

Keywords from Contributors

interactive projection generic sequences archival gaussian-processes genomics observability autograding hacking
Last synced: 5 months ago

Repository

A Library for Uncertainty Quantification.

Basic Info
Statistics
  • Stars: 913
  • Watchers: 11
  • Forks: 48
  • Open Issues: 3
  • Releases: 0
Archived
Topics
ai bayesian-inference calibration conformal-prediction deep-learning flax jax machine-learning ml model-calibration neural-networks uncertainty uncertainty-calibration uncertainty-estimation uncertainty-quantification
Created over 3 years ago · Last pushed 10 months ago
Metadata Files
Readme · Contributing · License · Code of conduct · Security

README.rst

Fortuna
#######

This project is archived and is no longer being actively developed or monitored.
################################################################################

.. image:: https://img.shields.io/pypi/status/Fortuna
    :target: https://img.shields.io/pypi/status/Fortuna
    :alt: PyPI - Status
.. image:: https://img.shields.io/pypi/dm/aws-fortuna
    :target: https://pypistats.org/packages/aws-fortuna
    :alt: PyPI - Downloads
.. image:: https://img.shields.io/pypi/v/aws-fortuna
    :target: https://img.shields.io/pypi/v/aws-fortuna
    :alt: PyPI - Version
.. image:: https://img.shields.io/github/license/awslabs/Fortuna
    :target: https://github.com/awslabs/Fortuna/blob/main/LICENSE
    :alt: License
.. image:: https://readthedocs.org/projects/aws-fortuna/badge/?version=latest
    :target: https://aws-fortuna.readthedocs.io
    :alt: Documentation Status

A Library for Uncertainty Quantification
========================================
Proper estimation of predictive uncertainty is fundamental in applications that involve critical decisions.
Uncertainty can be used to assess the reliability of model predictions, trigger human intervention,
or decide whether a model can be safely deployed in the wild.

Fortuna is a library for uncertainty quantification that makes it easy for users to run benchmarks and bring uncertainty to production systems.
Fortuna provides calibration and conformal methods starting from pre-trained models written in any framework,
and it further supports several Bayesian inference methods starting from deep learning models written in Flax.
The language is designed to be intuitive for practitioners unfamiliar with uncertainty quantification,
and is highly configurable.

Check the `documentation <https://aws-fortuna.readthedocs.io>`_ for a quickstart, examples and references.

Usage modes
===========
Fortuna offers three different usage modes:
*From uncertainty estimates*,
*From model outputs* and
*From Flax models*.
These serve users according to the constraints dictated by their own applications.
Their pipelines are depicted in the following figure, each starting from one of the green panels.

.. image:: https://github.com/awslabs/fortuna/raw/main/docs/source/_static/pipeline.png
    :target: https://github.com/awslabs/fortuna/raw/main/docs/source/_static/pipeline.png

From uncertainty estimates
---------------------------
Starting from uncertainty estimates has minimal compatibility requirements and is the quickest way to interact with the library.
This usage mode offers conformal prediction methods for both classification and regression.
These take uncertainty estimates as input,
and return rigorous sets of predictions that retain a user-given level of probability.
In one-dimensional regression tasks, conformal sets may be thought of as calibrated versions of confidence or credible intervals.

Mind that if the uncertainty estimates you provide as input are inaccurate,
conformal sets might be large and unusable.
For this reason, if your application allows it,
please consider the *From model outputs* and
*From Flax models* usage modes.

**Example.** Suppose you want to calibrate credible intervals with coverage error :code:`error`,
each corresponding to a different test input variable.
We assume that credible intervals are passed as arrays of lower and upper bounds,
respectively :code:`test_lower_bounds` and :code:`test_upper_bounds`.
You also have lower and upper bounds of credible intervals computed for several validation inputs,
respectively :code:`val_lower_bounds` and :code:`val_upper_bounds`.
The corresponding array of validation targets is denoted by :code:`val_targets`.
The following code produces *conformal prediction intervals*,
i.e. calibrated versions of your test credible intervals.

.. code-block:: python

  from fortuna.conformal import QuantileConformalRegressor

  # Calibrate the test credible intervals against the validation bounds and targets,
  # so that the returned intervals cover the true targets with probability 1 - error.
  conformal_intervals = QuantileConformalRegressor().conformal_interval(
      val_lower_bounds=val_lower_bounds, val_upper_bounds=val_upper_bounds,
      test_lower_bounds=test_lower_bounds, test_upper_bounds=test_upper_bounds,
      val_targets=val_targets, error=error)
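
For intuition, the following is a minimal NumPy sketch of the conformalized quantile regression idea underlying this calibration step (a conceptual illustration only, not Fortuna's implementation): the validation data determine a margin by which the test intervals are widened or shrunk to reach the desired coverage.

.. code-block:: python

  import numpy as np

  def conformalize_intervals(val_lower, val_upper, val_targets,
                             test_lower, test_upper, error):
      # Conformity score: how far each validation target falls outside its
      # interval (negative when it lies inside).
      scores = np.maximum(val_lower - val_targets, val_targets - val_upper)
      n = len(scores)
      # Finite-sample-corrected quantile of the scores.
      q = np.quantile(scores, min(1.0, (1 - error) * (n + 1) / n))
      # Adjust the test intervals by the same margin.
      return test_lower - q, test_upper + q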

From model outputs
------------------
Starting from model outputs assumes you have already trained a model in some framework,
and arrive at Fortuna with model outputs in :code:`numpy.ndarray` format for each input data point.
This usage mode allows you to calibrate your model outputs, estimate uncertainty,
compute metrics and obtain conformal sets.

Compared to the *From uncertainty estimates* usage mode,
this one offers better control,
as it can make sure uncertainty estimates have been appropriately calibrated.
However, if the model was trained with classical methods,
the resulting quantification of model (a.k.a. epistemic) uncertainty may be poor.
To mitigate this problem, please consider the *From Flax models*
usage mode.

**Example.**
Suppose you have validation and test model outputs,
respectively :code:`val_outputs` and :code:`test_outputs`.
Furthermore, you have arrays of validation and test target variables,
respectively :code:`val_targets` and :code:`test_targets`.
The following code provides a minimal classification example to get calibrated predictive entropy estimates.

.. code-block:: python

  from fortuna.output_calib_model import OutputCalibClassifier

  # Calibrate the classifier outputs on the validation data,
  # then compute predictive entropies for the test outputs.
  calib_model = OutputCalibClassifier()
  status = calib_model.calibrate(outputs=val_outputs, targets=val_targets)
  test_entropies = calib_model.predictive.entropy(outputs=test_outputs)
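
To give a sense of what calibrating model outputs can mean, below is a rough NumPy sketch of temperature scaling, one classic output-calibration technique. It is shown for intuition only and is not necessarily what :code:`OutputCalibClassifier` does internally; it assumes the outputs are logits and the targets are integer class labels.

.. code-block:: python

  import numpy as np

  def nll(logits, targets, temperature):
      # Average negative log-likelihood under a temperature-scaled softmax.
      scaled = logits / temperature
      scaled -= scaled.max(axis=-1, keepdims=True)  # numerical stability
      log_probs = scaled - np.log(np.exp(scaled).sum(axis=-1, keepdims=True))
      return -np.mean(log_probs[np.arange(len(targets)), targets])

  # Pick the temperature that minimizes the validation NLL with a simple grid search.
  temperatures = np.linspace(0.1, 5.0, 100)
  best_t = min(temperatures, key=lambda t: nll(val_outputs, val_targets, t))

  # Temperature-scaled (calibrated) test probabilities.
  scaled_test = test_outputs / best_t
  scaled_test -= scaled_test.max(axis=-1, keepdims=True)
  test_probs = np.exp(scaled_test) / np.exp(scaled_test).sum(axis=-1, keepdims=True)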

From Flax models
----------------
Starting from Flax models has higher compatibility requirements than the
*From uncertainty estimates*
and *From model outputs* usage modes,
as it requires deep learning models written in Flax.
However, it enables you to replace standard model training with scalable Bayesian inference procedures,
which may significantly improve the quantification of predictive uncertainty.

**Example.** Suppose you have a Flax classification deep learning model :code:`model` from inputs to logits, with output
dimension given by :code:`output_dim`. Furthermore,
you have training, validation and test TensorFlow data loaders :code:`train_data_loader`, :code:`val_data_loader`
and :code:`test_data_loader`, respectively.
The following code provides a minimal classification example to get calibrated probability estimates.

.. code-block:: python

  from fortuna.data import DataLoader

  # Wrap the TensorFlow data loaders into Fortuna data loaders.
  train_data_loader = DataLoader.from_tensorflow_data_loader(train_data_loader)
  calib_data_loader = DataLoader.from_tensorflow_data_loader(val_data_loader)
  test_data_loader = DataLoader.from_tensorflow_data_loader(test_data_loader)

  from fortuna.prob_model import ProbClassifier

  # Train the probabilistic classifier, using the validation loader for calibration,
  # then compute calibrated predictive probabilities on the test inputs.
  prob_model = ProbClassifier(model=model)
  status = prob_model.train(train_data_loader=train_data_loader, calib_data_loader=calib_data_loader)
  test_means = prob_model.predictive.mean(inputs_loader=test_data_loader.to_inputs_loader())
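
For reference, here is a minimal sketch of what :code:`model` could look like, i.e. a Flax module mapping input features to logits of dimension :code:`output_dim`; check Fortuna's documentation for the exact model interface it expects.

.. code-block:: python

  import flax.linen as nn

  class MLP(nn.Module):
      """A small multi-layer perceptron returning logits."""
      output_dim: int
      hidden_dim: int = 128

      @nn.compact
      def __call__(self, x):
          x = nn.Dense(self.hidden_dim)(x)
          x = nn.relu(x)
          return nn.Dense(self.output_dim)(x)

  model = MLP(output_dim=output_dim)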


Installation
============
**NOTE:** Before installing Fortuna, you are required to install JAX in your virtual environment.
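
For example, a CPU-only JAX installation could look as follows; see the JAX documentation for GPU or TPU builds.

.. code-block::

    pip install --upgrade "jax[cpu]"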

You can install Fortuna by typing

.. code-block::

    pip install aws-fortuna

Alternatively, you can build the package using Poetry.
If you choose this route, first install Poetry and add it to your PATH
(see the Poetry documentation). Then type

.. code-block::

    poetry install

All the dependencies will be installed at their required versions. Consider adding the following flags to the command above:

- :code:`-E transformers` if you want to use models and datasets from Hugging Face.
- :code:`-E sagemaker` if you want to install the dependencies necessary to run Fortuna on Amazon SageMaker.
- :code:`-E docs` if you want to install Sphinx dependencies to build the documentation.
- :code:`-E notebooks` if you want to work with Jupyter notebooks.
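
For example, to include the Hugging Face and notebook dependencies you could run:

.. code-block::

    poetry install -E transformers -E notebooks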

Finally, you can either access the virtualenv that Poetry created by typing :code:`poetry shell`,
or execute commands within the virtualenv using the :code:`run` command, e.g. :code:`poetry run python`.

Examples
========
Several usage examples are found in the :code:`/examples`
directory of the repository.

Amazon SageMaker integration
============================
We offer a simple pipeline that allows you to run Fortuna on Amazon SageMaker with minimal effort.

1. Create an AWS account - it is free! Store the account ID and the region where you want to launch training jobs.

2. First, update your local AWS credentials.
   Then you need to build and push a Docker image to an Amazon ECR repository.
   The helper script provided in the repository will help you do so -
   it will require your AWS account ID and region. If you need other packages to be included in your Docker image,
   you should consider customizing the Dockerfile.
   NOTE: the script has been tested on an M1 Mac running macOS.
   It is possible that different operating systems will need small modifications.

3. Create an S3 bucket.
   You will need this to dump the results from your training jobs on Amazon SageMaker.

4. Write a YAML configuration file. This will include your AWS details, the path to the entrypoint script that you want
   to run on Amazon SageMaker, the arguments to pass to the script, the path to the S3 bucket where you want to dump
   the results, the metrics to monitor, and more.
   Check the example configuration file in the repository for reference.

5. Finally, let :code:`config_dir` be the absolute path to the main configuration directory,
   and :code:`config_filename` the name of the main configuration file (without the .yaml extension).
   Enter Python and run the following:

.. code-block:: python

    from fortuna.sagemaker import run_training_job
    run_training_job(config_dir=config_dir, config_filename=config_filename)


Material
========
- AWS launch blog post
- Fortuna: A Library for Uncertainty Quantification in Deep Learning (arXiv preprint arXiv:2302.04019)

Citing Fortuna
==============
To cite Fortuna:

.. code-block::

    @article{detommaso2023fortuna,
      title={Fortuna: A Library for Uncertainty Quantification in Deep Learning},
      author={Detommaso, Gianluca and Gasparin, Alberto and Donini, Michele and Seeger, Matthias and Wilson, Andrew Gordon and Archambeau, Cedric},
      journal={arXiv preprint arXiv:2302.04019},
      year={2023}
    }

Contributing
============
If you wish to contribute to the project, please refer to our contribution guidelines.


License
=======
This project is licensed under the Apache-2.0 License.
See `LICENSE <https://github.com/awslabs/Fortuna/blob/main/LICENSE>`_ for more information.

Owner

  • Name: Amazon Web Services - Labs
  • Login: awslabs
  • Kind: organization
  • Location: Seattle, WA

AWS Labs

GitHub Events

Total
  • Watch event: 38
  • Delete event: 10
  • Member event: 1
  • Issue comment event: 7
  • Push event: 16
  • Pull request event: 16
  • Fork event: 4
  • Create event: 6
Last Year
  • Watch event: 38
  • Delete event: 10
  • Member event: 1
  • Issue comment event: 7
  • Push event: 16
  • Pull request event: 16
  • Fork event: 4
  • Create event: 6

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 228
  • Total Committers: 10
  • Avg Commits per committer: 22.8
  • Development Distribution Score (DDS): 0.579
Past Year
  • Commits: 172
  • Committers: 8
  • Avg Commits per committer: 21.5
  • Development Distribution Score (DDS): 0.442
Top Committers
  • Gianluca Detommaso (3****o): 96 commits
  • Gianluca Detommaso (d****a@g****m): 87 commits
  • dependabot[bot] (4****]): 23 commits
  • albertogaspar (a****r): 7 commits
  • Oleg Smirnov (o****v@g****m): 6 commits
  • Thomas Pinder (p****a@a****k): 5 commits
  • Edwin Fong (e****1@g****m): 1 commit
  • Paul Scemama (p****a@g****m): 1 commit
  • Thomas Pinder (t****r@l****k): 1 commit
  • Christian Leibig (4****g): 1 commit

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 11
  • Total pull requests: 172
  • Average time to close issues: 9 days
  • Average time to close pull requests: 12 days
  • Total issue authors: 5
  • Total pull request authors: 8
  • Average comments per issue: 6.0
  • Average comments per pull request: 0.23
  • Merged pull requests: 142
  • Bot issues: 0
  • Bot pull requests: 48
Past Year
  • Issues: 1
  • Pull requests: 23
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 1 month
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 8.0
  • Average comments per pull request: 0.78
  • Merged pull requests: 14
  • Bot issues: 0
  • Bot pull requests: 19
Top Authors
Issue Authors
  • PaulScemama (5)
  • mcapuccini (1)
  • yucenli (1)
  • charlesll (1)
Pull Request Authors
  • gianlucadetommaso (88)
  • dependabot[bot] (71)
  • albertogaspar (10)
  • wistuba (9)
  • master (8)
  • mcocheteux (2)
  • edfong (1)
  • thomaspinder (1)
Top Labels
Issue Labels
bug (5) enhancement (3)
Pull Request Labels
dependencies (71) enhancement (9) python (4) codebase (1) documentation (1) bug (1)

Dependencies

poetry.lock pypi
  • 141 dependencies
pyproject.toml pypi
  • Sphinx ^5.3.0
  • flax ^0.6.2
  • ipython ^8.7.0
  • matplotlib ^3.6.2
  • nbsphinx ^0.8.10
  • nbsphinx-link ^1.3.0
  • numpy ^1.23.4
  • optax ^0.1.3
  • pydata-sphinx-theme ^0.12.0
  • python >=3.8,<3.11
  • sphinx-autodoc-typehints ^1.19.5
  • sphinx-gallery ^0.11.1
  • tensorflow-cpu ^2.11.0
  • tqdm ^4.64.1