Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.8%) to scientific vocabulary

Keywords

approximate-bayesian-computation likelihood-free-inference simulation-based-inference summary-statistics
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: tillahoffmann
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 171 KB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed 10 months ago
Metadata Files
Readme License Citation

README.md

Minimizing the Expected Posterior Entropy Yields Optimal Summaries

This repository contains the code and data needed to reproduce the results presented in the manuscript "Minimizing the Expected Posterior Entropy Yields Optimal Summaries".

Figures and tables can be regenerated by executing the following steps:

  • Ensure a recent Python version is installed; this code has been tested with Python 3.10 on Ubuntu and macOS.
  • Optionally, create a new virtual environment.
  • Install the Python requirements by executing pip install -r requirements.txt from the root directory of the repository.
  • Install CmdStan by executing python -m cmdstanpy.install_cmdstan --version 2.31.0. Other recent versions of CmdStan may also work but have not been tested.
  • Optionally, verify the installation by executing pytest -v.
  • Execute cook exec "*:evaluation" to run all experiments and generate evaluation metrics, which are saved to workspace/[experiment name]/evaluation.csv.
  • Execute each of the Jupyter notebooks (saved as markdown files) in the notebooks folder to generate the figures.

Results Structure

After running the experiments (see above), the workspace folder contains all results. It is structured as follows, and the folder structure is repeated for each experiment.

    benchmark-large  # One folder for each experiment.
      data  # Train, validation, and test split as pickle files; other temp files may also be present.
        test.pkl
        train.pkl
        validation.pkl
        ...
      samples  # (Approximate) posterior samples as pickle files.
        [sampler configuration name].pkl
        ...
      transformers  # Trained transformers, e.g., posterior mean estimators, as pickle files.
        [transformer configuration name]-[digits].pkl  # One of three replications with diff. seeds.
        [transformer configuration name].pkl  # Best transformer amongst the three replications.
      evaluation.csv  # Evaluation of different summary statistic extraction methods.
    benchmark-small
      ...
    coalescent
      ...
    tree-large
      ...
    tree-large
      ...
    figures  # Contains PDF figures after executing notebooks.
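Given this layout, the per-experiment evaluation files can be gathered in a few lines. The following is a minimal sketch, not part of the repository: the helper name collect_evaluations is hypothetical, and it assumes each evaluation.csv is an ordinary comma-separated file with a header row, sitting exactly one level below the workspace folder.

```python
import csv
from pathlib import Path


def collect_evaluations(workspace):
    """Gather rows from every `[experiment name]/evaluation.csv` under `workspace`.

    Hypothetical helper for illustration; folders without an evaluation.csv
    (such as `figures`) are skipped because the glob does not match them.
    """
    rows = []
    for csv_path in sorted(Path(workspace).glob("*/evaluation.csv")):
        experiment = csv_path.parent.name
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                # Tag each row with the experiment folder it came from.
                rows.append({"experiment": experiment, **row})
    return rows
```

Calling collect_evaluations("workspace") from the repository root would return one dictionary per row, with all values left as strings as read by the csv module; convert them to float before doing arithmetic.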

Each evaluation.csv file has seven columns:

  • path, which refers to one of the methods used to extract summaries.
  • Three columns {nlp,rmise,mise}, which are best estimates of the negative log probability loss, root mean integrated squared error, and mean integrated squared error, respectively. The estimates are obtained by averaging over all samples in the corresponding test set.
  • Three columns {nlp,rmise,mise}_err, which are standard errors obtained as sqrt(var / (n - 1)), where var is the variance of the metric in the test set and n is the size of the test set.
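The estimate and its standard error can be illustrated with a short sketch. This is not the repository's implementation: the function name mean_and_standard_error is hypothetical, and it assumes that var denotes the population (biased) variance, which the description above does not state explicitly. Under that assumption, sqrt(var / (n - 1)) equals the usual sample standard deviation divided by sqrt(n).

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return the mean of `values` and its standard error sqrt(var / (n - 1)).

    Assumption: `var` is the population (biased) variance of the metric.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a standard error")
    mean = statistics.fmean(values)
    # Population variance: average squared deviation from the mean.
    var = statistics.pvariance(values, mu=mean)
    return mean, math.sqrt(var / (n - 1))
```

For example, mean_and_standard_error([1.0, 2.0, 3.0, 4.0]) returns a mean of 2.5 together with a standard error that matches statistics.stdev(values) / math.sqrt(4).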

Owner

  • Name: Till Hoffmann
  • Login: tillahoffmann
  • Kind: user
  • Location: Boston, MA
  • Company: Harvard T.H. Chan School of Public Health

Building network models at @HarvardChanSchool with a focus on open and reproducible science. Formerly @imperial, @spotify.

Citation (CITATION.cff)

# yaml-language-server: $schema=https://citation-file-format.github.io/1.2.0/schema.json
cff-version: 1.2.0
message: If you use this software, please cite it as below.
title: Minimizing the Expected Posterior Entropy Yields Optimal Summary Statistics
url: "https://github.com/onnela-lab/summaries"
authors:
- family-names: Hoffmann
  given-names: Till
  orcid: https://orcid.org/0000-0003-4403-0722
- family-names: Onnela
  given-names: Jukka-Pekka
  orcid: https://orcid.org/0000-0001-6613-8668
preferred-citation:
  type: article
  authors:
  - family-names: Hoffmann
    given-names: Till
    orcid: https://orcid.org/0000-0003-4403-0722
  - family-names: Onnela
    given-names: Jukka-Pekka
    orcid: https://orcid.org/0000-0001-6613-8668
  doi: 10.48550/arXiv.2206.02340
  journal: arXiv
  start: "2206.02340"
  title: Minimizing the Expected Posterior Entropy Yields Optimal Summary Statistics
  year: 2022

GitHub Events

Total
  • Push event: 2
Last Year
  • Push event: 2

Issues and Pull Requests

Last synced: 12 months ago

All Time
  • Total issues: 0
  • Total pull requests: 12
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 day
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • tillahoffmann (12)
Top Labels
Issue Labels
Pull Request Labels
enhancement (4)