https://github.com/aaltoml/bitvi

Repository for the paper "Approximate Bayesian Inference via Bitstring Representations" published at UAI2025

Basic Info
  • Host: GitHub
  • Owner: AaltoML
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 246 KB

# Official code repository for the paper: *Approximate Bayesian Inference via Bitstring Representations*

This repository is the official implementation of the methods in the publication:

* Aleksanteri Sladek, Martin Trapp and Arno Solin (2025). **Approximate Bayesian Inference via Bitstring Representations**. In *Proceedings of the 41st Conference on Uncertainty in Artificial Intelligence (UAI)*. [[OpenReview]](https://openreview.net/forum?id=nbsaJUjHQl) [[Proceedings]](https://proceedings.mlr.press/v286/sladek25a.html) [[arXiv]](https://arxiv.org/abs/2508.13598)

![demo.png](demo.png)

## Installing Dependencies

This codebase is primarily written in Python 3. Dependencies can be installed into a new environment using the conda command-line tool and the provided `environment.yml` file, which lists the Python version and all required Python packages. Create the environment with:

```
conda env create -f environment.yml
```

This command will create a new environment called 'bitvi' with the appropriate Python version and the packages needed to run the code. It can be activated via:

```
conda activate bitvi
```

For further information, see the [conda documentation](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).

Additionally, parts of this codebase require the [bayesianize](https://github.com/microsoft/bayesianize) Python library. Clone the repository from [https://github.com/microsoft/bayesianize](https://github.com/microsoft/bayesianize) into a folder in the same directory as the bitvi codebase, e.g.:

```
GitHub
  |
  |-> bitvi
  |     |-> src
  |     |-> scripts 
  .     .
  .     .
  .     .
  |-> bayesianize
  |     |-> bnn
  .     .
  .     .
  .     .
```

### Running the code on a GPU
The codebase is built on PyTorch, so the code can be run on a GPU to speed up execution. This may require changing the `environment.yml` file so that conda installs the `pytorch-gpu` package (instead of `pytorch`, which may install the CPU-only version) and then [updating the conda environment](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#updating-an-environment).

## Interactive example
A good place to start is the [marimo](https://docs.marimo.io/) notebook we provide, which contains an interactive version of Figure 1. The example shows how to initialize a 1D BitVI model and use it to approximate a mixture density. It also visualizes how the approximation quality changes with the number of bits used. The notebook can be found in `notebooks/bitvi_1d_example.py`.


## Running experiments in the paper

The experiments in the paper consist of fitting BitVI to 1D and 2D densities, and training a Bayesian NN on the Moons and Banana classification datasets and on the UCI benchmark. The sections below give details on how to reproduce these experiments.

### Learning a 1D density
The main entry point for approximating a known and possibly unnormalized 1D density is the Python script `scripts/fit_density_1d.py`. An example of running this script to reproduce the experiment shown in Figure 1 of the paper is provided with the bash script `scripts/figure_1.sh`. In the repository directory, run:
```
./scripts/figure_1.sh
```
Results will be stored in the `experiments` folder, under a sub-folder with the bash script's name.
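
To illustrate what fitting a "possibly unnormalized" density can mean, the sketch below evaluates an evidence lower bound (ELBO) for a piecewise-constant approximation on `[0, 1)`. It is not taken from `fit_density_1d.py`: the domain, the midpoint approximation of the expectation, and all names (`elbo_piecewise`, `log_unnormalized_target`) are assumptions made for illustration.

```python
import math


def elbo_piecewise(bin_probs, log_target):
    """ELBO of a piecewise-constant q on [0, 1) with equal-width bins.

    bin_probs[k] is the probability q assigns to bin k, and log_target is
    the log of a (possibly unnormalized) target density. The expectation
    inside each bin is approximated by evaluating at the bin midpoint.
    """
    num_bins = len(bin_probs)
    elbo = 0.0
    for k, w in enumerate(bin_probs):
        if w == 0.0:
            continue  # the term 0 * log(0) is taken to be 0
        mid = (k + 0.5) / num_bins
        log_q = math.log(w * num_bins)  # density of q inside bin k
        elbo += w * (log_target(mid) - log_q)
    return elbo


def log_unnormalized_target(x):
    # Hypothetical target: 3 times the uniform density, so log Z = log(3).
    return math.log(3.0)


elbo_uniform = elbo_piecewise([0.25] * 4, log_unnormalized_target)
elbo_skewed = elbo_piecewise([0.7, 0.1, 0.1, 0.1], log_unnormalized_target)
print(elbo_uniform, elbo_skewed)
```

With this toy target the uniform approximation matches the normalized target exactly, so its ELBO equals the log normalizing constant, while the skewed approximation gives a strictly lower value.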

### Learning a 2D density

The main entry point for approximating a known and possibly unnormalized 2D density is the Python script `scripts/fit_density_2d.py`. An example of running this script to reproduce the experiment shown in Figure 2 of the paper is provided with the bash script `scripts/figure_2.sh`. In the repository directory, run:
```
./scripts/figure_2.sh
```

This script runs 4-bit BitVI on the Gaussian mixture, Neal's funnel, multi-modal Gaussian, ring and banana target densities. Results will be stored in the `experiments` folder, under a sub-folder with the bash script's name.


Another example, which also runs full-covariance Gaussian VI on the above densities, is given by `scripts/figure_11.sh`, which reproduces the results shown in Figure 11:
```
./scripts/figure_11.sh
```

### Approximating a Bayesian Neural Network posterior

The next set of experiments in the paper approximates the posterior distribution of a Bayesian Neural Network (BNN) trained on the Moons and Banana classification tasks, as well as on a selection of UCI datasets via the Bayesian Benchmarks library. The following sections detail how to run this code.

#### Moons and Banana experiments

The experiment illustrated in Figure 6 consists of a BNN trained on the `moons` dataset with the posterior distribution approximated via BitVI and Fully-factorized Gaussian VI (FFGVI). For further comparison, a standard NN is also trained. These experiments can be recreated by running the script:

```
./scripts/figure_6.sh
```

which will train a BNN with a 2-bit, 4-bit and 8-bit BitVI variational family. It will also train the FFGVI version and the regular NN. These steps are performed by the Python scripts `train_bnn_bitvi.py`, `train_bnn_ffgvi.py` and `train_nn.py` in the `scripts` folder.


The 'chopping the banana' experiment illustrated in Figure 8 can be recreated with the `scripts/chop_bits.py` Python script. Running this script requires a previously trained BitVI model with a sufficiently high bit count. The directory containing this model is given as input, and the script then iteratively reduces the number of bits used in the model. Running the script `scripts/figure_8.sh` will train a BNN on the `banana` dataset with the same hyperparameters as were used for creating Figure 8. To run it, execute:
```
./scripts/figure_8.sh
```

This will create the directory `experiments/figure_8/`. Pass this directory as an argument to `chop_bits.py`:

```
python scripts/chop_bits.py --model_dir experiments/figure_8/
```

This will create a new directory, `experiments/chop_bits/`, with the decision boundary visualized for different numbers of bits.
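
One plausible reading of reducing the number of bits is to marginalize out the least significant bits of the distribution over bitstrings. The sketch below does this for an explicit table of bitstring probabilities. Whether `chop_bits.py` proceeds this way is an assumption, and the function name `chop_bits` and the example probabilities are purely illustrative.

```python
def chop_bits(leaf_probs, keep_bits):
    """Marginalize out the trailing bits of a distribution over bitstrings.

    leaf_probs has length 2**B and is indexed by the integer value of the
    bitstring, most significant bit first. Returns 2**keep_bits values.
    """
    total_bits = len(leaf_probs).bit_length() - 1
    if len(leaf_probs) != 2 ** total_bits:
        raise ValueError("leaf_probs must have a power-of-two length")
    if not 0 <= keep_bits <= total_bits:
        raise ValueError("keep_bits must be between 0 and the bit count")
    # Bitstrings sharing the first keep_bits bits are adjacent in the table.
    group = 2 ** (total_bits - keep_bits)
    return [sum(leaf_probs[i:i + group])
            for i in range(0, len(leaf_probs), group)]


# Hypothetical 3-bit distribution, chopped down one bit at a time.
probs_3bit = [0.05, 0.05, 0.10, 0.20, 0.30, 0.10, 0.15, 0.05]
for bits in (3, 2, 1):
    print(bits, chop_bits(probs_3bit, bits))
```

Because bitstrings are ordered most significant bit first, each chopped model is a coarser histogram of the same distribution rather than a newly trained one.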

#### Entropy experiment

The experiment illustrated in Figure 9 of the paper can be recreated by running:

```
./scripts/figure_9.sh
```

This bash script will execute the file `scripts/smoothness_figure.py`, which trains and visualizes a 16-bit BitVI model for an increasingly complex density function. The script shows the training progress in an interactive window, visualizing the true and approximate densities in the top panel and the circuit entropy as a function of the circuit depth (i.e., the number of bits used) in the lower panel.
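
As a rough illustration of entropy as a function of depth, the sketch below computes the entropy of the first `d` bits of a binary-tree distribution using the chain rule: each internal node contributes its binary entropy weighted by the probability of reaching it. The parameterization (a dictionary from bit prefixes to branch probabilities) and the function names are assumptions, and the exact entropy definition used by `smoothness_figure.py` may differ, for example by using a differential rather than a discrete entropy.

```python
import math


def binary_entropy(p):
    """Entropy in nats of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def entropy_by_depth(branch_probs, num_bits):
    """Entropy (nats) of the first d bits, for d = 1, ..., num_bits.

    branch_probs maps a bit prefix (tuple of 0/1 values) to the probability
    that the next bit is 1. Prefixes that are absent default to 0.5.
    """
    reach = {(): 1.0}  # probability of reaching each node at this depth
    total = 0.0
    entropies = []
    for _ in range(num_bits):
        next_reach = {}
        for prefix, r in reach.items():
            p_one = branch_probs.get(prefix, 0.5)
            total += r * binary_entropy(p_one)  # chain-rule contribution
            next_reach[prefix + (0,)] = r * (1.0 - p_one)
            next_reach[prefix + (1,)] = r * p_one
        reach = next_reach
        entropies.append(total)
    return entropies


# Hypothetical parameters: confident top bits, uniform lower bits.
entropies = entropy_by_depth({(): 0.9, (1,): 0.99}, num_bits=4)
print(entropies)
```

The loop visits every node of the tree, so the cost grows as `2**num_bits`. That is manageable for a single 16-bit variable but not for deep multi-dimensional trees.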

#### Bayesian Benchmarks experiment

The results of the Bayesian Benchmarks experiment are shown in Table 1 of the paper. The experiment consists of training 2-bit, 4-bit and 8-bit BitVI, Full-covariance Gaussian VI and Fully-factorized Gaussian VI models on several datasets. The hyperparameters for each dataset and model combination are defined in the JSON files stored in `src/uci_experiment/econfigs`. These are aggregated by the `scripts/make_grid.py` script into text files containing lists of commands (which can be run on a cluster via SLURM, for example). To recreate the commands for the experiments in Table 1, run:

```
./scripts/table_1.sh
```
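
The general idea of turning a hyperparameter grid into a list of commands can be sketched as follows. This is not the logic of `scripts/make_grid.py`: the configuration keys, dataset names, and the target script `train.py` are hypothetical placeholders, and the real JSON schema may differ.

```python
import itertools


def make_commands(script, grid):
    """Expand a dict of option lists into one command line per combination."""
    keys = sorted(grid)  # fixed key order keeps the output reproducible
    commands = []
    for values in itertools.product(*(grid[k] for k in keys)):
        flags = " ".join(f"--{k} {v}" for k, v in zip(keys, values))
        commands.append(f"python {script} {flags}")
    return commands


# Hypothetical grid: 2 datasets x 3 bit counts gives 6 commands.
grid = {"dataset": ["dataset_a", "dataset_b"], "num_bits": [2, 4, 8]}
commands = make_commands("train.py", grid)
print("\n".join(commands))
```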
The text files will be stored in `src/uci_experiment/econfigs`. The commands in these text files can be executed on a cluster with the provided SLURM script in `scripts/slurm`, for example:
```
scripts/slurm/launch.sh src/uci_experiment/econfigs/uci_bitvi_8bit.txt
```

Alternatively, the commands can be run individually via a script of your own. Note that you will most likely need to modify the SLURM scripts for your cluster. Each command runs a 5-fold cross-validated experiment.
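
For readers unfamiliar with the setup, a 5-fold split can be sketched as below. This is a generic illustration with contiguous, unshuffled folds. How the repository actually splits the data (shuffling, seeding, or use of a library) is not specified here, and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(num_samples, num_folds=5):
    """Yield (train_idx, test_idx) pairs for contiguous, near-equal folds."""
    base, extra = divmod(num_samples, num_folds)
    start = 0
    for fold in range(num_folds):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = (list(range(0, start))
                     + list(range(start + size, num_samples)))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(12, num_folds=5))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Every sample appears in exactly one test fold, so averaging the per-fold metrics uses each data point once for evaluation.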

Once all the generated commands have been run, the results are aggregated using the script `src/uci_experiment/process_results.py`, which generates a file `aggregated_result.json`. This is then processed into a `tex` table via `src/uci_experiment/results_to_latex.py`. Note that this script bolds results in the table using a t-test.

## Codebase structure

The codebase is roughly organized as follows. The `src` folder contains reusable code, such as class and function definitions. The `scripts` folder contains code that uses `src` to perform various tasks, such as the experiments outlined in the sections above. The `data` folder contains data required for running the experiments. Finally, an `experiments` folder is created by many of the scripts for storing experiment results.

### Noteworthy pieces of code
The main contribution of this paper, the deterministic probabilistic circuit (PC) forming the variational family in BitVI, is contained within the file `src/density_models/circuits.py`. This file contains three classes: `TreeCircuit1D`, `TreeCircuitND` and `ParallelTreeCircuit`.
- The class `TreeCircuit1D` represents a binary-tree-structured deterministic PC over 1 random variable.
- The class `TreeCircuitND` represents a binary-tree-structured deterministic PC over N random variables. This structure iteratively cycles through each dimension's bits and constructs a binary tree of depth `N * num_bits`. Hence, it does not scale well beyond a few dimensions and bits.
- Finally, `ParallelTreeCircuit` is a parallelization of `K` instances of a `TreeCircuit1D`. It is intended for performing mean-field VI for the parameters of a BNN, where `ParallelTreeCircuit` represents the fully-factorized joint distribution over the parameters of a BNN's weight matrix for a single layer.
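
To make the binary-tree structure concrete, here is a minimal pure-Python sketch of a tree-structured density over the bits of a number in `[0, 1)`. It is not the implementation in `circuits.py`: the function name `tree_density`, the dictionary parameterization, and the choice of domain are all assumptions made for illustration.

```python
def tree_density(x, branch_probs, num_bits):
    """Evaluate a piecewise-constant density on [0, 1) at x.

    branch_probs maps a bit prefix (tuple of 0/1 values) to the probability
    that the next bit is 1. Prefixes that are absent default to 0.5.
    """
    if not 0.0 <= x < 1.0:
        return 0.0
    prefix = ()
    prob = 1.0
    for _ in range(num_bits):
        x *= 2.0
        bit = int(x)  # next bit of the binary expansion of x
        x -= bit
        p_one = branch_probs.get(prefix, 0.5)
        prob *= p_one if bit == 1 else 1.0 - p_one
        prefix += (bit,)
    # Each leaf covers an interval of width 2**-num_bits.
    return prob * 2 ** num_bits


# Hypothetical parameters for a 2-bit tree (four bins of width 0.25).
branch_probs = {(): 0.8, (1,): 0.25}
num_bits = 2
bin_mids = [(k + 0.5) / 2 ** num_bits for k in range(2 ** num_bits)]
densities = [tree_density(m, branch_probs, num_bits) for m in bin_mids]
total_mass = sum(d / 2 ** num_bits for d in densities)
print(densities, total_mass)
```

Only one root-to-leaf path is active for any input, which is what makes the structure deterministic: evaluating the density costs `num_bits` multiplications, even though the tree has `2**num_bits` leaves.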

## Acknowledgements

- This codebase relies on the [bayesianize](https://github.com/microsoft/bayesianize) library by Microsoft for performing mean-field Gaussian VI on Bayesian Neural Networks.

- This codebase utilizes code from the [Bayesian Benchmarks](https://github.com/hughsalimbeni/bayesian_benchmarks) library by Hugh Salimbeni et al. for evaluating our method on various UCI datasets.

- This codebase uses several of the datasets provided in the [UCI Machine Learning Repository](https://archive.ics.uci.edu/) via the [Bayesian Benchmarks](https://github.com/secondmind-labs/bayesian_benchmarks) library.

- This codebase utilizes code from the [improved-hyperparameter-learning](https://github.com/AaltoML/improved-hyperparameter-learning) library by Rui Li et al. for running the Bayesian Benchmarks experiments.

- This codebase utilizes code from the [squared-npcs](https://github.com/april-tools/squared-npcs) library by Lorenzo Loconte et al. for running grid experiments.

## Citation
If you want to cite the paper, you can use the following bibtex entry:
```bibtex
@InProceedings{sladek2025approximate,
  title     = {Approximate Bayesian Inference via Bitstring Representations},
  author    = {Sladek, Aleksanteri and Trapp, Martin and Solin, Arno},
  booktitle = {Proceedings of the 41st Conference on Uncertainty in Artificial Intelligence},
  pages     = {3939--3957},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR}
}
```

## License
This software is provided under the [MIT license](LICENSE), unless otherwise stated. Portions of the software are under the Apache 2.0 License and GPL 3.0 License.

Owner

  • Name: AaltoML
  • Login: AaltoML
  • Kind: organization
  • Location: Finland

Machine learning group at Aalto University led by Prof. Solin