https://github.com/deepskies/deepdiagnostics

Inference diagnostics for mostly SBI

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.6%) to scientific vocabulary
Last synced: 5 months ago

Repository

Inference diagnostics for mostly SBI

Basic Info
  • Host: GitHub
  • Owner: deepskies
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 30.3 MB
Statistics
  • Stars: 3
  • Watchers: 7
  • Forks: 3
  • Open Issues: 39
  • Releases: 5
Created about 2 years ago · Last pushed 5 months ago
Metadata Files
Readme License

README.md

DeepDiagnostics

DeepDiagnostics is a package for diagnosing the posterior produced by an inference method. It is flexible and applicable to both simulation-based and likelihood-based inference.

Documentation

Read our docs on github.io

Installation

From PyPI

```sh
pip install deepdiagnostics
```

From Source

This project is built with Poetry. If working from source, we recommend using poetry run to run any commands that use the package. You can also use poetry env activate to get the path to the Python virtual environment used by Poetry. For additional information, please view Poetry's environment management documentation.

```sh
git clone https://github.com/deepskies/DeepDiagnostics/
pip install poetry
poetry env activate
poetry install
poetry run pytest
poetry run diagnose --config {config path}
```

Quickstart

View the template YAML here for a minimal working example with our supplied sample data to get started.

Data and Model Requirements

To access your trained model, use the SBIModel class to load a trained model in the form of a .pkl file. The format specifics are shown here. If you wish to use a different model format, we encourage you to open a new issue requesting it, or, even better, write a subclass of deepdiagnostics.models.Model to include it!

To read in your own data, supply an .h5 or .pkl file and specify your format in the data.data_engine field of the configuration file. The possible fields are listed here. We recommend an .h5 file.

The data must have the following fields:

  • xs - The range of data your parameters have been tested against. For example, if you are modeling y = mx + b, your xs are the values you have tested for x. Please ensure they are of the shape (xsize, nsamples).
  • thetas - The parameters that characterize your problem. For example, if you are modeling y = mx + b, your thetas are m and b. Please ensure they are in the shape (nparameters, nsamples) and ordered the same way parameter_labels is supplied in your configuration file, to prevent mislabelled plots.
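As a sketch of what such a file could look like, written with h5py (the array sizes here are illustrative; only the field names and shape conventions come from the requirements above):

```python
import h5py
import numpy as np

rng = np.random.default_rng(0)
n_samples, x_size, n_parameters = 100, 50, 2

# xs: the x values each sample was evaluated at, shape (xsize, nsamples)
xs = np.tile(np.linspace(0, 1, x_size)[:, None], (1, n_samples))

# thetas: e.g. (m, b) pairs for y = mx + b, shape (nparameters, nsamples),
# ordered to match parameter_labels in the configuration file
thetas = rng.normal(size=(n_parameters, n_samples))

with h5py.File("diagnostic_data.h5", "w") as f:
    f.create_dataset("xs", data=xs)
    f.create_dataset("thetas", data=thetas)
```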

If you do not supply a simulator method, including a ys field allows a lookup-table substitute to stand in for the simulator.
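The lookup-table idea can be illustrated with a minimal sketch (a conceptual illustration only, not DeepDiagnostics' internal implementation): given precomputed thetas and their simulated ys, the substitute returns the stored output of the nearest stored parameter vector.

```python
import numpy as np

def lookup_simulator(thetas, ys):
    """Build a nearest-neighbor substitute for a simulator.

    thetas: (nparameters, nsamples) precomputed parameters
    ys: (ysize, nsamples) precomputed simulator outputs
    """
    def simulate(theta):
        # Distance from the query to every stored parameter vector (per column).
        distances = np.linalg.norm(thetas - np.asarray(theta)[:, None], axis=0)
        # Return the stored output of the closest stored theta.
        return ys[:, np.argmin(distances)]
    return simulate

# Precomputed grid for y = m*x + b on x in [0, 1]
x = np.linspace(0, 1, 5)
thetas = np.array([[1.0, 2.0], [0.0, 1.0]])  # columns are (m, b) pairs
ys = np.stack([m * x + b for m, b in thetas.T], axis=1)

sim = lookup_simulator(thetas, ys)
```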

Pipeline

DeepDiagnostics includes a CLI tool for analysis.

  • To run the tool using a configuration file:

```sh
diagnose --config {path to yaml}
```

  • To use defaults with specific models and data:

```sh
diagnose --model_path {model pkl} --data_path {data pkl} [--simulator {sim name}]
```

Additional arguments can be found using diagnose -h.

Standalone

DeepDiagnostics comes with the option to run different plots and metrics independently.

Setting a configuration ahead of time ensures reproducibility with parameters and seeds. It is encouraged, but not required.

```py
from deepdiagnostics.utils.configuration import Config
from deepdiagnostics.models import SBIModel
from deepdiagnostics.data import H5Data

from deepdiagnostics.plots import LocalTwoSampleTest, Ranks

Config({configuration_path})
model = SBIModel({model_path})
data = H5Data({data_path}, simulator={simulator name})

LocalTwoSampleTest(data=data, model=model, show=True)(use_intensity_plot=False, n_alpha_samples=200)
Ranks(data=data, model=model, show=True)(num_bins=3)
```

Contributing

Please view the Deep Skies Lab contributing guidelines before opening a pull request.

DeepDiagnostics is structured so that any new metric or plot can be added by adding a class that is a child of metrics.Metric or plots.Display.

These child classes need a few methods. A minimal example of both a metric and a display is below.

It is strongly encouraged to provide typing for all inputs of the plot and calculate methods so they can be automatically documented.

Please use the DataDisplay proxy format for all plots, so that results can be re-plotted.

Metric

```py
from typing import Sequence

from deepdiagnostics.metrics import Metric

class NewMetric(Metric):
    """
    {What the metric is, any resources or credits.}

    .. code-block:: python

        {a basic example on how to run the metric}
    """
    def __init__(self, model, data, out_dir=None, save=True, use_progress_bar=None,
                 samples_per_inference=None, percentiles=None, number_simulations=None) -> None:
        # Initialize the parent Metric
        super().__init__(model, data, out_dir, save, use_progress_bar,
                         samples_per_inference, percentiles, number_simulations)

        # Any other calculations that need to be done ahead of time

    def _collect_data_params(self):
        # Compute anything that needs to be done each time the metric is calculated.
        return None

    def calculate(self, metric_kwargs: dict[str, int]) -> Sequence[int]:
        """
        Description of the calculations

        Kwargs:
            metric_kwargs (Required, dict[str, int]): dictionary of the metrics to return, under the name "metric".

        Returns:
            Sequence[int]: list of the number in metric_kwargs
        """
        # Where the main calculation takes place, used by the metric __call__.
        # Update 'self.output' so the results are saved to results.json.
        self.output = {"The Result of the calculation": [metric_kwargs["metric"]]}

        return [0]  # Return the result so the metric can be used standalone.
```

Display

```py
import matplotlib.pyplot as plt

from deepdiagnostics.plots.plot import Display

class NewPlot(Display):
    def __init__(self, model, data, save, show, out_dir=None, percentiles=None,
                 use_progress_bar=None, samples_per_inference=None, number_simulations=None,
                 parameter_names=None, parameter_colors=None, colorway=None):
        """
        {Description of the plot}

        .. code-block:: python

            {How to run the plot}
        """
        super().__init__(model, data, save, show, out_dir, percentiles, use_progress_bar,
                         samples_per_inference, number_simulations, parameter_names,
                         parameter_colors, colorway)

    def plot_name(self):
        # The name of the plot (the filename, to be saved as out_dir/{file_name}).
        # When you run the plot for the first time, it will yell at you if you haven't made this a .png path.
        return "new_plot.png"

    def _data_setup(self):
        # Anything that needs to run before the plot can be made: model inference, etc.
        pass

    def plot_settings(self):
        # If there are additional settings to pull from the config.
        pass

    def plot(self, plot_kwarg: float):
        """
        Args:
            plot_kwarg (float, required): Some kwarg
        """
        plt.plot([0, 1], [plot_kwarg, plot_kwarg])
```

Adding to the package

If you wish the addition to run via the CLI tool, a few things need to be done. For this example, we will add a new metric, but an identical workflow applies for plots; just modify the plots submodule instead of metrics.

  1. Add the name and mapping to the submodule __init__.py.
src/deepdiagnostics/metrics/__init__.py

```py
...
from deepdiagnostics.metrics.{your metric file} import NewMetric

Metrics = {
    ...,
    "NewMetric": NewMetric
}
```

  2. Add the name and defaults to Defaults.py.
src/deepdiagnostics/utils/Defaults.py

```py
Defaults = {
    "common": {...},
    ...,
    "metrics": {
        ...,
        "NewMetric": {"default_kwarg": "default overwriting the metric_default in the function definition."}
    }
}
```

  3. Add a test to the repository and ensure it passes.
tests/test_metrics.py

```py
import os

from deepdiagnostics.metrics import NewMetric

...

def test_new_metric(metric_config, mock_model, mock_data):
    Config(metric_config)
    new_metric = NewMetric(mock_model, mock_data, save=True)
    expected_results = {what you should get out}
    real_results = new_metric.calculate("kwargs that produce the expected results")
    assert expected_results.all() == real_results.all()

    new_metric()
    assert new_metric.output is not None
    assert os.path.exists(f"{new_metric.out_dir}/diagnostic_metrics.json")
```

```console
python3 -m pytest tests/test_metrics.py::test_new_metric
```

  4. Add documentation.
docs/source/metrics.rst

```rst
from deepdiagnostics.metrics import NewMetric

.. _metrics:

Metrics
=======

.. autoclass:: deepdiagnostics.metrics.metric.Metric
    :members:

...

.. autoclass:: deepdiagnostics.metrics.new_metric.NewMetric
    :members: calculate

.. bibliography::
```

Building documentation:

  • Documentation automatically updates after any push to the main branch, according to readthedocs.yml. Verify that the documentation built correctly by checking the Read the Docs badge.

Publishing a release:

  • Releases to PyPI are built automatically off the main branch whenever a GitHub release is made.
  • Before publishing, update the version number in pyproject.toml to match the release you are going to make.
  • Create a new GitHub release and monitor the publish.yml action to verify the new release is built properly.

Citation

```
@article{key,
    author  = {Me :D},
    title   = {title},
    journal = {journal},
    volume  = {v},
    year    = {20XX},
    number  = {X},
    pages   = {XX--XX}
}
```

Acknowledgement

This software has been authored by an employee or employees of Fermi Research Alliance, LLC (FRA), operator of the Fermi National Accelerator Laboratory (Fermilab) under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy.

Owner

  • Name: Deep Skies Lab
  • Login: deepskies
  • Kind: organization
  • Email: deepskieslab@gmail.com

Building community and making discoveries since 2017

GitHub Events

Total
  • Create event: 7
  • Release event: 2
  • Issues event: 43
  • Watch event: 1
  • Delete event: 4
  • Member event: 1
  • Issue comment event: 62
  • Push event: 23
  • Pull request review comment event: 10
  • Pull request review event: 10
  • Pull request event: 21
  • Fork event: 3
Last Year
  • Create event: 7
  • Release event: 2
  • Issues event: 43
  • Watch event: 1
  • Delete event: 4
  • Member event: 1
  • Issue comment event: 62
  • Push event: 23
  • Pull request review comment event: 10
  • Pull request review event: 10
  • Pull request event: 21
  • Fork event: 3

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 46
  • Total pull requests: 24
  • Average time to close issues: 3 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 5
  • Total pull request authors: 4
  • Average comments per issue: 1.15
  • Average comments per pull request: 0.42
  • Merged pull requests: 15
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 32
  • Pull requests: 15
  • Average time to close issues: 23 days
  • Average time to close pull requests: 9 days
  • Issue authors: 4
  • Pull request authors: 4
  • Average comments per issue: 0.31
  • Average comments per pull request: 0.33
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • voetberg (25)
  • jsv1206 (8)
  • bnord (7)
  • namrathaurs (3)
  • beckynevin (3)
Pull Request Authors
  • voetberg (20)
  • prasanthcakewalk (2)
  • scarletnorberg (1)
  • jsv1206 (1)
Top Labels
Issue Labels
documentation (10) enhancement (3) To do (3) in progress (3) question (2) High Priority (1) good first issue (1) informational (1)
Pull Request Labels

Dependencies

.github/workflows/lint.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/publish.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/test.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
poetry.lock pypi
  • 128 dependencies
pyproject.toml pypi
  • black ^24.3.0 develop
  • deepbench ^0.2.2 develop
  • flake8 ^7.0.0 develop
  • pre-commit ^3.3.2 develop
  • pytest ^7.3.2 develop
  • pytest-cov ^4.1.0 develop
  • ruff ^0.3.5 develop
  • sphinx ^7.2.6 develop
  • sphinx-autodoc-typehints ^2.2.1 develop
  • sphinxcontrib-bibtex ^2.6.2 develop
  • deprecation ^2.1.0
  • getdist ^1.4.7
  • h5py ^3.10.0
  • matplotlib ^3.8.3
  • numpy >=1.18.5,<1.26.0
  • pyarrow ^15.0.0
  • python >=3.9,<3.12
  • sbi ^0.22.0
  • scipy >=1.6.0, <1.9.2
  • tarp ^0.1.1