getdist

MCMC sample analysis, kernel densities, plotting, and GUI

https://github.com/cmbant/getdist

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    4 of 14 committers (28.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.6%) to scientific vocabulary

Keywords

contour-plot kernel-density-estimation mcmc plotting-in-python sampling-methods statistical-inference

Keywords from Contributors

cosmology mpi
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: cmbant
  • License: other
  • Language: Python
  • Default Branch: master
  • Size: 85.5 MB
Statistics
  • Stars: 165
  • Watchers: 9
  • Forks: 56
  • Open Issues: 0
  • Releases: 19
Topics
contour-plot kernel-density-estimation mcmc plotting-in-python sampling-methods statistical-inference
Created almost 11 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing Codemeta

README.rst

===================
GetDist
===================
:GetDist: MCMC sample analysis, plotting and GUI
:Author: Antony Lewis
:Homepage: https://getdist.readthedocs.io
:Source: https://github.com/cmbant/getdist
:Reference: https://arxiv.org/abs/1910.13970

.. image:: https://github.com/cmbant/getdist/actions/workflows/tests.yml/badge.svg
   :target: https://github.com/cmbant/getdist/actions/workflows/tests.yml
.. image:: https://img.shields.io/pypi/v/GetDist.svg?style=flat
   :target: https://pypi.python.org/pypi/GetDist/
.. image:: https://readthedocs.org/projects/getdist/badge/?version=latest
   :target: https://getdist.readthedocs.io/en/latest
.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/cmbant/getdist/master?filepath=docs%2Fplot_gallery.ipynb
.. image:: https://img.shields.io/badge/arXiv-1910.13970-b31b1b.svg?color=0B6523
   :target: https://arxiv.org/abs/1910.13970

Description
============

GetDist is a Python package for analysing Monte Carlo samples, including correlated samples
from Markov Chain Monte Carlo (MCMC).

* **Point and click GUI** - select chain files, view plots, marginalized constraints, LaTeX tables and more (Qt-based desktop app and Streamlit web interface)
* **Plotting library** - make custom publication-ready 1D, 2D, 3D-scatter, triangle and other plots
* **Named parameters** - simple handling of many parameters using parameter names, including LaTeX labels and prior bounds
* **Optimized Kernel Density Estimation** - automated optimal bandwidth choice for 1D and 2D densities (Botev et al. Improved Sheather-Jones method), with boundary and bias correction
* **Convergence diagnostics** - including correlation length and diagonalized Gelman-Rubin statistics
* **LaTeX tables** for marginalized 1D constraints

See the `Plot Gallery and tutorial `_
(`run online `_)
and `GetDist Documentation `_.


Getting Started
================

Install getdist using pip::

    $ pip install getdist

or from source files using::

    $ pip install -e /path/to/source/

You can test if things are working using the unit test by running::

    $ python -m unittest getdist.tests.getdist_test

Check that the dependencies listed in the next section are installed. You can then use the getdist module from your scripts, or
use the GetDist GUI (*getdist-gui* command).

Once installed, the best way to get up to speed is probably to read through
the `Plot Gallery and tutorial `_.

Dependencies
=============
* Python 3.10+
* matplotlib
* scipy
* PySide6 - optional, only needed for Qt-based GUI
* Streamlit - optional, only needed for web-based GUI
* Working LaTeX installation (not essential, only for some plotting/table functions)

Python distributions like Anaconda have most of what you need (except for LaTeX).

To use the Qt-based `GUI `_ you need PySide6.
To use the Streamlit web interface, you need Streamlit.
See the `GUI docs `_ for suggestions on how to install both.

Algorithm details
==================

Details of kernel density estimation (KDE) algorithms and references are given in the GetDist notes
`arXiv:1910.13970 `_.

Samples file format
===================

GetDist can be used in scripts and interactively with standard numpy arrays
(as in the `examples `_).

Scripts and the `GetDist GUI `_ can also read parameter sample/chain files in plain text format
(or in the format output by the `Cobaya `__ sampling program).

Plain text sample files are of the form::

  xxx_1.txt
  xxx_2.txt
  ...
  xxx.paramnames
  xxx.ranges

where "xxx" is some root file name.

The .txt files are separate chain files (there can also be just one xxx.txt file). Each row of each sample .txt file is in the format

  *weight like param1 param2 param3* ...

The *weight* gives the number of samples (or importance weight) with these parameters. *like* gives -log(posterior), and *param1, param2...* are the values of the parameters at the sample point. The first two columns can be 1 and 0 if they are not known or used.
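As a minimal sketch of how these columns can be interpreted (plain Python, not GetDist's own code; the row values and the helper name ``weighted_mean`` are invented for illustration), the weighted mean of a parameter and the best-fit sample could be computed like this:

```python
# Illustrative rows in the "weight like param1 param2" layout described above.
# The numbers are made up for this sketch.
chain_text = """\
2  1.5  0.10  1.00
1  0.9  0.30  1.20
3  1.1  0.20  0.90
"""

rows = [[float(v) for v in line.split()]
        for line in chain_text.splitlines() if line.strip()]
weights = [r[0] for r in rows]    # sample multiplicity or importance weight
loglikes = [r[1] for r in rows]   # the "like" column: -log(posterior)
params = [r[2:] for r in rows]    # remaining columns are parameter values


def weighted_mean(values, weights):
    """Mean of values where each sample counts in proportion to its weight."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


mean_p1 = weighted_mean([p[0] for p in params], weights)

# The smallest -log(posterior) marks the highest-posterior sample in the chain.
best_index = min(range(len(rows)), key=lambda i: loglikes[i])
best_fit = params[best_index]
print(mean_p1, best_fit)
```

In practice GetDist computes these statistics for you; the sketch only shows why the weight column must enter every average.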

The .paramnames file lists the names of the parameters, one per line, optionally followed by a LaTeX label. Names cannot include spaces, and if they end in "*" they are interpreted as derived (rather than MCMC) parameters, e.g.::

 x1   x_1
 y1   y_1
 x2   x_2
 xy*  x_1+y_1

The .ranges file gives hard bounds for the parameters, e.g.::

 x1  -5 5
 x2   0 N

Note that not all parameters need to be specified, and "N" can be used to denote that a particular upper or lower limit is unbounded. The ranges are used to determine densities and plot bounds if there are samples near the boundary; if there are no samples anywhere near the boundary the ranges have no effect on plot bounds, which are chosen appropriately for the range of the samples.
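The two auxiliary file formats are simple enough to parse by hand. The following is a rough sketch in plain Python, not GetDist's own reader; the function names ``parse_paramnames`` and ``parse_ranges`` are hypothetical, and falling back to the name when no label is given is an assumption of this sketch:

```python
paramnames_text = """\
x1   x_1
y1   y_1
x2   x_2
xy*  x_1+y_1
"""

ranges_text = """\
x1  -5 5
x2   0 N
"""


def parse_paramnames(text):
    """Return a list of (name, label, is_derived) tuples."""
    params = []
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.split(None, 1)  # name first, then the optional LaTeX label
        name = parts[0]
        # Assumption for this sketch: use the bare name when no label is given.
        label = parts[1].strip() if len(parts) > 1 else name.rstrip("*")
        is_derived = name.endswith("*")  # trailing "*" marks a derived parameter
        params.append((name.rstrip("*"), label, is_derived))
    return params


def parse_ranges(text):
    """Return {name: (lower, upper)} with None for an unbounded "N" limit."""
    def limit(token):
        return None if token == "N" else float(token)

    ranges = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, lower, upper = line.split()
        ranges[name] = (limit(lower), limit(upper))
    return ranges


params = parse_paramnames(paramnames_text)
ranges = parse_ranges(ranges_text)
print(params)
print(ranges)
```

Parameters absent from the ranges text (here *y1* and *xy*) simply have no entry, matching the note that not all parameters need to be specified.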

There can also optionally be a .properties.ini file, which can specify *burn_removed=T* to ensure no burn in is removed, or *ignore_rows=x* to ignore the first
fraction *x* of the file rows (or if *x > 1*, the specified number of rows).
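The two meanings of *ignore_rows* can be illustrated with a small helper. This is only a sketch of the rule as stated above, not GetDist's implementation; ``rows_to_skip`` is a hypothetical name, and the rounding of fractional row counts is an assumption:

```python
def rows_to_skip(n_rows, ignore_rows):
    """Number of leading rows to drop from a chain file with n_rows rows."""
    if ignore_rows > 1:
        # Values above 1 are read as an absolute number of rows.
        return min(int(ignore_rows), n_rows)
    # Otherwise the value is a fraction of the file; rounding down is an
    # assumption of this sketch and GetDist's own rounding may differ.
    return int(ignore_rows * n_rows)


chain = list(range(1000))  # stand-in for 1000 chain rows
kept_fraction = chain[rows_to_skip(len(chain), 0.25):]
kept_count = chain[rows_to_skip(len(chain), 50):]
print(len(kept_fraction), len(kept_count))
```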

Loading samples
===================

To load an MCSamples object from text files do::

     from getdist import loadMCSamples
     samples = loadMCSamples('/path/to/xxx', settings={'ignore_rows':0.3})

Here *settings* gives optional parameter settings for the analysis. *ignore_rows* is useful for MCMC chains where you want to
discard some fraction from the start of each chain as burn in (use a number >1 to discard a fixed number of sample lines rather than a fraction).
The MCSamples object can be passed to plot functions, or used to get many results. For example, to plot marginalized parameter densities
for parameter names *x1* and *x2*::

    from getdist import plots
    g = plots.get_single_plotter()
    g.plot_2d(samples, ['x1', 'x2'])

When you have many different chain files in the same directory,
plotting can work directly with the root file names. For example, to compare *x* and *y* constraints
from two chains with root names *xxx* and *yyy*::

    from getdist import plots
    g = plots.get_single_plotter(chain_dir='/path/to/', analysis_settings={'ignore_rows':0.3})
    g.plot_2d(['xxx','yyy'], ['x', 'y'])


MCSamples objects can also be constructed directly from numpy arrays in memory, see the example
in the `Plot Gallery `_,
and from `ArviZ, PyMC and other sampler formats `_.

GetDist script
===================

If you have chain files on disk, you can also quickly calculate convergence and marginalized statistics using the *getdist* script:

    usage: getdist [-h] [--ignore_rows IGNORE_ROWS] [-V] [ini_file] [chain_root]

    GetDist sample analyser

    positional arguments:
      *ini_file*              .ini file with analysis settings (optional, if omitted uses defaults)

      *chain_root*            Root name of chain to analyse (e.g. chains/test), required unless file_root specified in ini_file

    optional arguments:
      -h, --help            show this help message and exit
      --ignore_rows IGNORE_ROWS
                            set initial fraction of chains to cut as burn in
                            (fraction of total rows, or >1 number of rows);
                            overrides any value in ini_file if set
      --make_param_file MAKE_PARAM_FILE
                        Produce a sample distparams.ini file that you can edit
                        and use when running GetDist
      -V, --version         show program's version number and exit

where *ini_file* is optionally a .ini file listing *key=value* parameter option values, and *chain_root* is the root file name of the chains.
For example::

   getdist distparams.ini chains/test_chain

This produces a set of files containing parameter means and limits (.margestats), N-D likelihood contour boundaries and best-fit sample (.likestats),
convergence diagnostics (.converge), parameter covariance and correlation (.covmat and .corr), and optionally various simple plotting scripts.
If no *ini_file* is given, default settings are used. The *ignore_rows* option allows some of the start of each chain file to be removed as burn in.

To customize settings you can run::

   getdist --make_param_file distparams.ini

to produce the setting file distparams.ini, edit it, then run with your custom settings.

GetDist GUI
===================

GetDist provides two graphical user interfaces:

1. **Qt-based Desktop App**: Run *getdist-gui* to use the traditional desktop interface. This requires PySide6 to be installed.

2. **Streamlit Web Interface**: Run *getdist-streamlit* to use the browser-based interface. This requires Streamlit to be installed.

Both interfaces allow you to open a folder of chain files, then easily select, open, plot and compare, as well as viewing standard GetDist outputs and tables.

You can also try the Streamlit interface online at ``_ (with fixed example chains).

See the `GUI Documentation `_ for more details on both interfaces.


Using with CosmoMC and Cobaya
=============================

This GetDist package is general, but is mainly developed for analysing chains from the `CosmoMC `_
and `Cobaya `_ sampling programs.
You do not need to install this package separately if you have a full CosmoMC installation; installing Cobaya also installs GetDist as a dependency.
Detailed help is available for plotting Planck chains
and using CosmoMC parameter grids in the `Readme `_.

Citation
===================
You can refer to the JCAP paper::

      @article{Lewis:2019xzd,
         author = "Lewis, Antony",
         title = "{GetDist: a Python package for analysing Monte Carlo samples}",
         eprint = "1910.13970",
         archivePrefix = "arXiv",
         primaryClass = "astro-ph.IM",
         doi = "10.1088/1475-7516/2025/08/025",
         journal = "JCAP",
         volume = "08",
         pages = "025",
         year = "2025"
      }

and references therein as appropriate.

LLM Integration
===================
For AI assistants and LLM agents working with GetDist, a single-file context document is available at `GetDist LLM Context `_. This document provides a comprehensive overview of GetDist's functionality, common usage patterns, and best practices in a format optimized for LLM context windows.

Contributing
===================
Please see the `Contributing Guide `_.

===================

.. image:: https://raw.githubusercontent.com/CobayaSampler/cobaya/master/img/Sussex_white.svg
   :alt: University of Sussex
   :target: https://www.sussex.ac.uk/astronomy/
   :height: 200px
   :width: 200px

.. image:: https://raw.githubusercontent.com/CobayaSampler/cobaya/master/img/ERC_white.svg
   :alt: European Research Council
   :target: https://erc.europa.eu/
   :height: 200px
   :width: 200px

.. image:: https://cdn.cosmologist.info/antony/STFC_white.svg
   :alt: STFC
   :target: https://stfc.ukri.org/
   :height: 200px
   :width: 200px

Owner

  • Name: Antony Lewis
  • Login: cmbant
  • Kind: user
  • Company: University of Sussex

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "name": "GetDist: Monte Carlo sample analysis, plotting and GUI",
  "description": "Analysis and plotting of Monte Carlo (and other) samples, including correlated samples from Markov Chain Monte Carlo (MCMC). It offers a point and click GUI for selecting chain files, viewing plots, marginalized constraints, and LaTeX tables, and includes a plotting library for making custom publication-ready 1D, 2D, 3D-scatter, triangle and other plots. Its convergence diagnostics include correlation length and diagonalized Gelman-Rubin statistics, and the optimized kernel density estimation provides an automated optimal bandwidth choice for 1D and 2D densities with boundary and bias correction.",
  "identifier": "https://dx.doi.org/10.5281/zenodo.3522420",
  "author": [
    {
      "@type": "Person",
      "givenName": "Antony",
      "familyName": "Lewis",
      "id": "https://orcid.org/0000-0001-5927-6667"
    }
  ],
  "citation": "https://arxiv.org/abs/1910.13970",
  "relatedLink": [
    "https://getdist.readthedocs.io/"
  ],
  "codeRepository": "https://github.com/cmbant/getdist",
  "programmingLanguage": "Python",
  "referencePublication": [
    {
      "@type": "ScholarlyArticle",
      "url": "https://arxiv.org/abs/1910.13970",
      "id": "arXiv:1910.13970"
    }
  ],
  "version": "1",
  "license": "https://github.com/cmbant/getdist/blob/master/LICENCE.txt"
}

GitHub Events

Total
  • Create event: 15
  • Release event: 4
  • Issues event: 11
  • Watch event: 42
  • Delete event: 2
  • Issue comment event: 15
  • Push event: 89
  • Pull request review event: 2
  • Pull request event: 2
  • Fork event: 8
Last Year
  • Create event: 15
  • Release event: 4
  • Issues event: 11
  • Watch event: 42
  • Delete event: 2
  • Issue comment event: 15
  • Push event: 89
  • Pull request review event: 2
  • Pull request event: 2
  • Fork event: 8

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 449
  • Total Committers: 14
  • Avg Commits per committer: 32.071
  • Development Distribution Score (DDS): 0.105
Past Year
  • Commits: 89
  • Committers: 1
  • Avg Commits per committer: 89.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Antony Lewis a****y@c****o 402
Antony Lewis a****y@c****t 20
Jesús Torrado J****o 6
Will Handley w****0@c****k 4
Chris Ringeval e****t@p****m 4
mraveri m****i@s****u 2
Xavier Garrido x****o@g****m 2
Mathew S. Madhavacheril m****c@g****m 2
chris c****s@r****m 2
xyh-cosmo y****u@n****n 1
Tuomo Salmi t****i@u****l 1
Nathan Musoke n****e@g****m 1
Edward Higson e****n@m****k 1
DolgikhKA d****5@p****u 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 85
  • Total pull requests: 42
  • Average time to close issues: 2 months
  • Average time to close pull requests: 18 days
  • Total issue authors: 65
  • Total pull request authors: 13
  • Average comments per issue: 3.45
  • Average comments per pull request: 1.33
  • Merged pull requests: 30
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 6
  • Pull requests: 3
  • Average time to close issues: 1 day
  • Average time to close pull requests: 17 days
  • Issue authors: 6
  • Pull request authors: 1
  • Average comments per issue: 2.67
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • taladi (4)
  • vivianmiranda (3)
  • s-ilic (3)
  • eatdust (3)
  • lukashergt (2)
  • HengamehB (2)
  • borisbolliet (2)
  • bhorowitz (2)
  • inigozubeldia (2)
  • am610 (2)
  • potassium-chloride (2)
  • IamSreeman (2)
  • SimoneAmmazzalorso (2)
  • cmbant (2)
  • williamjameshandley (2)
Pull Request Authors
  • JesusTorrado (11)
  • cmbant (10)
  • eatdust (5)
  • msyriac (3)
  • williamjameshandley (3)
  • xyh-cosmo (2)
  • mraveri (2)
  • xgarrido (2)
  • potassium-chloride (1)
  • ejhigson (1)
  • surhudm (1)
  • thjsal (1)
  • musoke (1)
Top Labels
Issue Labels
enhancement (2)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 8,992 last-month
  • Total dependent packages: 14
    (may contain duplicates)
  • Total dependent repositories: 24
    (may contain duplicates)
  • Total versions: 65
  • Total maintainers: 1
pypi.org: getdist

GetDist Monte Carlo sample analysis, plotting and GUI

  • Versions: 62
  • Dependent Packages: 14
  • Dependent Repositories: 23
  • Downloads: 8,992 Last month
Rankings
Dependent packages count: 2.1%
Dependent repos count: 3.1%
Downloads: 3.7%
Average: 4.3%
Forks count: 6.1%
Stargazers count: 6.7%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: getdist

GetDist is a Python package for analysing Monte Carlo samples, including correlated samples from Markov Chain Monte Carlo (MCMC).

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
Rankings
Dependent repos count: 24.1%
Forks count: 26.2%
Stargazers count: 32.6%
Average: 33.6%
Dependent packages count: 51.5%
Last synced: 6 months ago