PyAutoFit

PyAutoFit: A Classy Probabilistic Programming Language for Model Composition and Fitting - Published in JOSS (2021)

https://github.com/rhayes777/pyautofit

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    3 of 13 committers (23.1%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

astronomy bayesian-inference bayesian-methods graphical-models mcmc model probabilistic-programming python statistical-analysis statistics stats

Keywords from Contributors

gravitational-lenses astrophysics cosmology lens-modeling physics galaxy exoplanets blackhole meshes simulations

Scientific Fields

Engineering Computer Science - 40% confidence
Last synced: 4 months ago

Repository

PyAutoFit: Classy Probabilistic Programming

Basic Info
Statistics
  • Stars: 63
  • Watchers: 7
  • Forks: 12
  • Open Issues: 40
  • Releases: 20
Topics
astronomy bayesian-inference bayesian-methods graphical-models mcmc model probabilistic-programming python statistical-analysis statistics stats
Created about 7 years ago · Last pushed 4 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.rst

PyAutoFit: Classy Probabilistic Programming
===========================================

.. |binder| image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/Jammy2211/autofit_workspace/HEAD

.. |RTD| image:: https://readthedocs.org/projects/pyautofit/badge/?version=latest
    :target: https://pyautofit.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

.. |Tests| image:: https://github.com/rhayes777/PyAutoFit/actions/workflows/main.yml/badge.svg
   :target: https://github.com/rhayes777/PyAutoFit/actions

.. |Build| image:: https://github.com/rhayes777/PyAutoBuild/actions/workflows/release.yml/badge.svg
   :target: https://github.com/rhayes777/PyAutoBuild/actions

.. |JOSS| image:: https://joss.theoj.org/papers/10.21105/joss.02550/status.svg
   :target: https://doi.org/10.21105/joss.02550

|binder| |Tests| |Build| |RTD| |JOSS|

`Installation Guide `_ |
`readthedocs `_ |
`Introduction on Binder `_ |
`HowToFit `_

**PyAutoFit** is a Python-based probabilistic programming language for model fitting and Bayesian inference
of large datasets.

The basic **PyAutoFit** API allows a user to quickly compose a probabilistic model and fit it to data via a
log likelihood function, using a range of non-linear search algorithms (e.g. MCMC, nested sampling).

Users can then set up a **PyAutoFit** scientific workflow, which enables streamlined modeling of small
datasets with tools to scale up to large datasets.

**PyAutoFit** supports advanced statistical methods, most
notably `a big data framework for Bayesian hierarchical analysis `_.

Getting Started
---------------

The following links are useful for new starters:

- `The PyAutoFit readthedocs `_, which includes an `installation guide `_ and an overview of **PyAutoFit**'s core features.

- `The introduction Jupyter Notebook on Binder `_, where you can try **PyAutoFit** in a web browser (without installation).

- `The autofit_workspace GitHub repository `_, which includes example scripts and the `HowToFit Jupyter notebook lectures `_ which give new users a step-by-step introduction to **PyAutoFit**.

Support
-------

Support for installation issues, help with model fitting and using **PyAutoFit** is available by
`raising an issue on the GitHub issues page `_.

We also offer support on the **PyAutoFit** `Slack channel `_, where we post the
latest updates on **PyAutoFit**. Slack is invitation-only, so if you'd like to join, send
an `email `_ requesting an invite.

HowToFit
--------

Users less familiar with Bayesian inference and scientific analysis may wish to read through
the **HowToFit** lectures. These teach the basic principles of Bayesian inference, with the
content pitched at undergraduate level and above.

A complete overview of the lectures `is provided on the HowToFit readthedocs page `_.

API Overview
------------

To illustrate the **PyAutoFit** API, we use a toy model: fitting a one-dimensional Gaussian to
noisy 1D data. Here's the ``data`` (black) and the model (red) we'll fit:

.. image:: https://raw.githubusercontent.com/rhayes777/PyAutoFit/main/files/toy_model_fit.png
  :width: 400

We define our model, a 1D Gaussian, by writing a Python class using the format below:

.. code-block:: python

    import numpy as np


    class Gaussian:

        def __init__(
            self,
            centre=0.0,        # <- PyAutoFit recognises these
            normalization=0.1, # <- constructor arguments are
            sigma=0.01,        # <- the Gaussian's parameters.
        ):
            self.centre = centre
            self.normalization = normalization
            self.sigma = sigma

        def model_data_from(self, xvalues):
            """
            An instance of the Gaussian class will be available during model fitting.

            This method will be used to fit the model to data and compute a likelihood.
            """

            # Shift the x values so the profile is centred on ``self.centre``.
            transformed_xvalues = xvalues - self.centre

            return (self.normalization / (self.sigma * (2.0 * np.pi) ** 0.5)) * \
                    np.exp(-0.5 * (transformed_xvalues / self.sigma) ** 2.0)

**PyAutoFit** recognises that this Gaussian may be treated as a model component whose parameters can be fitted via
a non-linear search like `emcee `_.
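
To give a feel for how constructor arguments can double as model parameters, here is a minimal sketch using only the standard library. It is an illustration under assumptions, not a description of **PyAutoFit**'s internals: the helper ``constructor_parameters`` is hypothetical, and the ``Gaussian`` class is repeated (without its method) so the snippet runs on its own.

```python
import inspect


class Gaussian:
    # Same constructor as the README's Gaussian, repeated so this runs standalone.
    def __init__(self, centre=0.0, normalization=0.1, sigma=0.01):
        self.centre = centre
        self.normalization = normalization
        self.sigma = sigma


def constructor_parameters(cls):
    """Hypothetical helper: map each __init__ argument (except self) to its default."""
    signature = inspect.signature(cls.__init__)
    return {
        name: parameter.default
        for name, parameter in signature.parameters.items()
        if name != "self"
    }


parameters = constructor_parameters(Gaussian)
print(parameters)  # {'centre': 0.0, 'normalization': 0.1, 'sigma': 0.01}
```

A search could then build a new instance from any trial parameter values with ``Gaussian(**values)``, which is the role the ``instance`` argument plays in the likelihood function.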

To fit this Gaussian to the ``data`` we create an Analysis object, which gives **PyAutoFit** the ``data`` and a
``log_likelihood_function`` describing how to fit the ``data`` with the model:

.. code-block:: python

    import autofit as af
    import numpy as np


    class Analysis(af.Analysis):

        def __init__(self, data, noise_map):

            self.data = data
            self.noise_map = noise_map

        def log_likelihood_function(self, instance):

            """
            The 'instance' that comes into this method is an instance of the Gaussian class
            above, with the parameters set to values chosen by the non-linear search.
            """

            print("Gaussian Instance:")
            print("Centre = ", instance.centre)
            print("normalization = ", instance.normalization)
            print("Sigma = ", instance.sigma)

            """
            We fit the ``data`` with the Gaussian instance, using its
            "model_data_from" function to create the model data.
            """

            xvalues = np.arange(self.data.shape[0])

            model_data = instance.model_data_from(xvalues=xvalues)
            residual_map = self.data - model_data
            chi_squared_map = (residual_map / self.noise_map) ** 2.0
            log_likelihood = -0.5 * np.sum(chi_squared_map)

            return log_likelihood
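
The likelihood calculation above can be checked by hand without **PyAutoFit** installed. The sketch below uses only the standard library and simulates a dataset from hypothetical "true" parameters (centre 50, normalization 25, sigma 10, noise 0.1, all chosen purely for illustration). It then confirms that the true model scores a higher log likelihood than a model with an offset centre. The function names are assumptions for this example, not part of the library.

```python
import math
import random


def gaussian_model_data(xvalues, centre, normalization, sigma):
    """Evaluate the README's 1D Gaussian profile at each x value."""
    amplitude = normalization / (sigma * math.sqrt(2.0 * math.pi))
    return [amplitude * math.exp(-0.5 * ((x - centre) / sigma) ** 2) for x in xvalues]


def log_likelihood(data, noise_map, model_data):
    """Chi-squared log likelihood: -0.5 * sum(((data - model) / noise) ** 2)."""
    return -0.5 * sum(
        ((d - m) / n) ** 2 for d, m, n in zip(data, model_data, noise_map)
    )


# Simulate noisy data from illustrative (made-up) true parameters.
random.seed(1)
xvalues = list(range(100))
true_model = gaussian_model_data(xvalues, centre=50.0, normalization=25.0, sigma=10.0)
noise_map = [0.1] * len(xvalues)
data = [m + random.gauss(0.0, 0.1) for m in true_model]

# Compare the true model against one whose centre is shifted by one sigma.
offset_model = gaussian_model_data(xvalues, centre=40.0, normalization=25.0, sigma=10.0)
ll_true = log_likelihood(data, noise_map, true_model)
ll_offset = log_likelihood(data, noise_map, offset_model)
print(ll_true > ll_offset)  # True
```

A non-linear search repeats this kind of comparison many times, moving towards parameter values with higher log likelihood.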

We can now fit our model to the ``data`` using a non-linear search:

.. code-block:: python

    model = af.Model(Gaussian)

    analysis = Analysis(data=data, noise_map=noise_map)

    emcee = af.Emcee(nwalkers=50, nsteps=2000)

    result = emcee.fit(model=model, analysis=analysis)

The ``result`` contains information on the model-fit, for example the parameter samples, maximum log likelihood
model and marginalized probability density functions.
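
To make these terms concrete, the sketch below runs a basic random-walk Metropolis sampler over a single free parameter (the centre) and then summarises the samples. This is deliberately not emcee's ensemble algorithm, nor the way **PyAutoFit** stores results. It is a standard-library illustration, with made-up true values and step size, of what "parameter samples", "maximum log likelihood model" and a marginalized estimate mean.

```python
import math
import random


def model_data_from(xvalues, centre, normalization=25.0, sigma=10.0):
    """1D Gaussian profile; normalization and sigma are held fixed for simplicity."""
    amplitude = normalization / (sigma * math.sqrt(2.0 * math.pi))
    return [amplitude * math.exp(-0.5 * ((x - centre) / sigma) ** 2) for x in xvalues]


def log_likelihood(centre, xvalues, data, noise):
    model = model_data_from(xvalues, centre)
    return -0.5 * sum(((d - m) / noise) ** 2 for d, m in zip(data, model))


# Simulated dataset with an illustrative true centre of 50.
random.seed(0)
xvalues = list(range(100))
noise = 0.1
data = [m + random.gauss(0.0, noise) for m in model_data_from(xvalues, centre=50.0)]

# Random-walk Metropolis with a flat prior on the centre.
current = 40.0
current_ll = log_likelihood(current, xvalues, data, noise)
samples = []  # (centre, log_likelihood) pairs
for _ in range(3000):
    proposal = current + random.gauss(0.0, 0.5)
    proposal_ll = log_likelihood(proposal, xvalues, data, noise)
    # Accept with probability min(1, L_proposal / L_current).
    if random.random() < math.exp(min(0.0, proposal_ll - current_ll)):
        current, current_ll = proposal, proposal_ll
    samples.append((current, current_ll))

burned = samples[500:]  # discard the burn-in phase
max_ll_centre = max(burned, key=lambda sample: sample[1])[0]
centres = sorted(sample[0] for sample in burned)
median_centre = centres[len(centres) // 2]  # summary of the marginalized PDF
print(max_ll_centre, median_centre)
```

Both estimates should land close to the simulated centre of 50, with the spread of ``centres`` indicating the parameter's uncertainty.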

Owner

  • Name: Richard Hayes
  • Login: rhayes777
  • Kind: user
  • Location: London
  • Company: RGH Software

I'm a freelance developer with a passion for creating well engineered software.

JOSS Publication

PyAutoFit: A Classy Probabilistic Programming Language for Model Composition and Fitting
Published
February 05, 2021
Volume 6, Issue 58, Page 2550
Authors
James W. Nightingale ORCID
Institute for Computational Cosmology, Stockton Rd, Durham, United Kingdom, DH1 3LE
Richard G. Hayes
Institute for Computational Cosmology, Stockton Rd, Durham, United Kingdom, DH1 3LE
Matthew Griffiths ORCID
ConcR Ltd, London, UK
Editor
Dan Foreman-Mackey ORCID
Tags
statistics Bayesian inference probabilistic programming model fitting

Citation (CITATIONS.rst)

.. _references:

Citations & References
======================

The bibtex entries for **PyAutoFit** and its affiliated software packages can be found
`here <https://github.com/rhayes777/PyAutoFit/blob/main/files/citations.bib>`_, with example text for citing **PyAutoFit**
in `.tex format here <https://github.com/rhayes777/PyAutoFit/blob/main/files/citation.tex>`_ and
`.md format here <https://github.com/rhayes777/PyAutoFit/blob/main/files/citations.md>`_. As shown in the examples, we
would greatly appreciate it if you mention **PyAutoFit** by name and include a link to our GitHub page!

**PyAutoFit** is published in the `Journal of Open Source Software <https://joss.theoj.org/papers/10.21105/joss.02550#>`_ and its
entry in the above .bib file is under the key ``pyautofit``.

Papers & Mentions

Total mentions: 2

Expression, Purification, and Structural Insights for the Human Uric Acid Transporter, GLUT9, Using the Xenopus laevis Oocytes System
Last synced: 2 months ago
Ablation of the Locus Coeruleus Increases Oxidative Stress in Tg-2576 Transgenic but Not Wild-Type Mice
Last synced: 2 months ago

GitHub Events

Total
  • Create event: 72
  • Release event: 3
  • Issues event: 46
  • Watch event: 5
  • Delete event: 70
  • Issue comment event: 77
  • Push event: 163
  • Pull request review comment event: 8
  • Pull request review event: 62
  • Pull request event: 121
  • Fork event: 1
Last Year
  • Create event: 72
  • Release event: 3
  • Issues event: 46
  • Watch event: 5
  • Delete event: 70
  • Issue comment event: 77
  • Push event: 163
  • Pull request review comment event: 8
  • Pull request review event: 62
  • Pull request event: 121
  • Fork event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 6,482
  • Total Committers: 13
  • Avg Commits per committer: 498.615
  • Development Distribution Score (DDS): 0.355
Past Year
  • Commits: 403
  • Committers: 3
  • Avg Commits per committer: 134.333
  • Development Distribution Score (DDS): 0.278
Top Committers
Name Email Commits
Richard r****7@g****m 4,184
James Nightingale j****e@d****k 1,965
Matthew Griffiths m****w@c****o 252
Jammy2211 J****1@g****m 33
Richard Hayes r****s@R****n 16
Jonathan Frawley j****y@d****k 11
Jacob O. Hjortlund j****d@g****m 11
Dan Foreman-Mackey f****y@g****m 3
dependabot[bot] 4****] 2
Other o****r@O****l 2
Victor Forouhar v****r@d****k 1
Jack O'Donnell j****l@g****m 1
Arfon Smith a****n 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 192
  • Total pull requests: 391
  • Average time to close issues: 6 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 6
  • Total pull request authors: 6
  • Average comments per issue: 1.4
  • Average comments per pull request: 0.8
  • Merged pull requests: 368
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 29
  • Pull requests: 122
  • Average time to close issues: 26 days
  • Average time to close pull requests: 6 days
  • Issue authors: 2
  • Pull request authors: 2
  • Average comments per issue: 0.9
  • Average comments per pull request: 0.84
  • Merged pull requests: 111
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Jammy2211 (167)
  • rhayes777 (15)
  • VictorForouhar (3)
  • jacob-hjortlund (1)
  • ickc (1)
  • Findus23 (1)
Pull Request Authors
  • rhayes777 (272)
  • Jammy2211 (207)
  • matthewghgriffiths (3)
  • CKrawczyk (2)
  • jhod0 (2)
  • dependabot[bot] (1)
Top Labels
Issue Labels
enhancement (6) bug (2) dependencies (1)
Pull Request Labels
dependencies (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 3,031 last-month
  • Total dependent packages: 5
  • Total dependent repositories: 4
  • Total versions: 304
  • Total maintainers: 2
pypi.org: autofit

Classy Probabilistic Programming

  • Versions: 304
  • Dependent Packages: 5
  • Dependent Repositories: 4
  • Downloads: 3,031 Last month
Rankings
Dependent packages count: 1.6%
Average: 7.2%
Downloads: 7.5%
Dependent repos count: 7.5%
Stargazers count: 9.0%
Forks count: 10.5%
Maintainers (2)
Last synced: 4 months ago