Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.1%) to scientific vocabulary
Keywords
Repository
Code for Post-hoc Probabilistic Vision-Language Models
Basic Info
Statistics
- Stars: 5
- Watchers: 4
- Forks: 1
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Post-hoc Probabilistic Vision-Language Models

Paper: https://arxiv.org/abs/2412.06014
Project page: https://aaltoml.github.io/BayesVLM/
Setup Instructions
- Ensure you have Python version >= 3.11 installed.
- Install the required packages by running: pip install -r requirements.txt
- Set DATA_BASE_DIR in your .env file. You can use the structure from the .env.example file: DATA_BASE_DIR=/path/to/datasets
- Add the project root directory to the PYTHONPATH environment variable: export PYTHONPATH=$PYTHONPATH:/path/to/project/root (a combined shell sketch of these steps follows below).
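Taken together, the setup steps above might look like the following shell sketch; the dataset and project paths are placeholders, and the .env contents are assumed to mirror .env.example.

```bash
# Install dependencies (assumes Python >= 3.11 is already available)
pip install -r requirements.txt

# Point the code at your dataset root via the .env file (placeholder path)
echo "DATA_BASE_DIR=/path/to/datasets" > .env

# Make the project importable from the scripts (placeholder path)
export PYTHONPATH=$PYTHONPATH:/path/to/project/root
```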
Running the Code
To run the Hessian estimation code, use the following command:

```bash
python scripts/hessian_estimation.py
```

To run the zero-shot experiments, use the following command:

```bash
python scripts/zeroshot.py
```

To run the active-learning experiments, use the following command:

```bash
python scripts/activelearning.py
```
Note that each of these commands accepts additional arguments for adjusting the Hessian estimation and the zero-shot/active-learning experiments.
Hessians
The precomputed Hessians for the models used in the paper are available in the hessians/ folder. You can select a specific hessian by setting --hessian_dir in the provided scripts.
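For example, a zero-shot run with one of the precomputed Hessians might look like the following; the subfolder name is a placeholder, so check hessians/ for the available options.

```bash
# Run zero-shot evaluation with a chosen precomputed Hessian
# (the folder name below is a placeholder; use one of the folders in hessians/)
python scripts/zeroshot.py --hessian_dir hessians/<model-name>
```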
Notebooks
A notebook stepping through the zero-shot code is available in notebooks/zeroshot.ipynb.
Data Setup
The data is stored in the DATA_BASE_DIR folder and is structured as follows:
```bash
DATA_BASE_DIR/
├── cifar10/
├── cifar100/
├── eurosat/
├── flowers102/
├── food101/
├── homeoffice/
├── imagenet1k/
├── imagenet_r/
├── imagenet_val_wds/
├── laion400m/
├── sun397/
├── ucf101/
```
Please set the DATA_BASE_DIR environment variable accordingly.
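As an illustrative example, the variable and the manually populated folders could be prepared as follows; the root path is a placeholder, and the remaining folders are created by the automatic downloads or the per-dataset steps below.

```bash
# Placeholder dataset root; keep this consistent with DATA_BASE_DIR in .env
export DATA_BASE_DIR=/path/to/datasets

# Pre-create the folders that are filled in manually in the steps below
mkdir -p "$DATA_BASE_DIR"/{eurosat,homeoffice,imagenet_val_wds,laion400m,sun397,ucf101}
```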
CIFAR-10
The CIFAR-10 dataset is automatically downloaded by the Hugging Face datasets library.
CIFAR-100
The CIFAR-100 dataset is automatically downloaded by the Hugging Face datasets library.
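If you want to trigger the automatic download ahead of time, a quick check along these lines should work; this is illustrative, and the repository's own data-loading code may configure caching differently.

```bash
# Pre-fetch CIFAR-10 and CIFAR-100 via the Hugging Face datasets library
python -c "from datasets import load_dataset; load_dataset('cifar10'); load_dataset('cifar100')"
```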
EuroSAT
From https://github.com/vishaal27/SuS-X/blob/main/data/DATA.md
- Create a folder named eurosat/ under DATA_BASE_DIR.
- Download the dataset from http://madm.dfki.de/files/sentinel/EuroSAT.zip and extract it to DATA_BASE_DIR/eurosat/.
- Download split_zhou_EuroSAT.json from here and put it under DATA_BASE_DIR/eurosat.
The directory structure should look like
eurosat/
|–– 2750/
|–– split_zhou_EuroSAT.json
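The download and extraction steps above can be carried out from the shell roughly as follows; the split file comes from the link given in the list and is not reproduced here.

```bash
# Download and extract EuroSAT into the expected folder
mkdir -p "$DATA_BASE_DIR/eurosat"
wget -P "$DATA_BASE_DIR/eurosat" http://madm.dfki.de/files/sentinel/EuroSAT.zip
unzip "$DATA_BASE_DIR/eurosat/EuroSAT.zip" -d "$DATA_BASE_DIR/eurosat"
# Place split_zhou_EuroSAT.json (from the link above) next to the 2750/ folder
```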
Flowers102
The Flowers102 dataset is automatically downloaded by the torchvision library.
Food101
The Food101 dataset is automatically downloaded by the torchvision library.
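To pre-fetch these torchvision datasets before running the experiments, something like the following should work; the root paths are assumptions based on the directory layout above, and the repository's own loaders may pass a different root.

```bash
# Pre-download Flowers102 and Food101 via torchvision (roots are illustrative)
python - <<'EOF'
import os
import torchvision

root = os.environ["DATA_BASE_DIR"]
torchvision.datasets.Flowers102(root=os.path.join(root, "flowers102"), download=True)
torchvision.datasets.Food101(root=os.path.join(root, "food101"), download=True)
EOF
```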
HomeOffice
Download the dataset from https://www.hemanthdv.org/officeHomeDataset.html and extract it to DATA_BASE_DIR/homeoffice/.
The directory structure should look like
homeoffice/
|–– Art/
|–– Clipart/
|–– Product/
|–– Real World/
|–– ImageInfo.csv
|–– imagelist.txt
Stanford Cars
Follow the instructions at https://github.com/pytorch/vision/issues/7545#issuecomment-1631441616 to download the dataset and extract it to DATA_BASE_DIR/stanford_cars/.
DTD
The DTD dataset is automatically downloaded by the torchvision library.
Imagenet Web-Dataset (val)
We supply the script scripts/download_imagenet.py to download all validation tar files for the ImageNet dataset from the Hugging Face Datasets Hub.
After running the script, the directory structure should look like
imagenet_val_wds/
|–– imagenet1k-validation-00.tar
|–– imagenet1k-validation-01.tar
|–– ...
|–– imagenet1k-validation-63.tar
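A minimal invocation is sketched below; any configurable options (such as the output directory) are defined by the script itself, so check its source. The listing assumes the shards end up under DATA_BASE_DIR/imagenet_val_wds.

```bash
# Download the 64 ImageNet validation shards from the Hugging Face Hub
python scripts/download_imagenet.py

# Check that the shards are in place (assumes they land in imagenet_val_wds/)
ls "$DATA_BASE_DIR/imagenet_val_wds" | wc -l
```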
Laion400M
The LAION-400M dataset can be downloaded using the img2dataset tool; the instructions for LAION-400M are available here.
Before running the img2dataset script, we removed all data points marked as NSFW in the metadata.
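An illustrative img2dataset invocation, based on that tool's general usage, is shown below; the metadata path is a placeholder and the exact settings used in the paper may differ, so follow the linked instructions.

```bash
# Download LAION-400M into webdataset shards after filtering out rows marked
# NSFW in the metadata (metadata path is a placeholder)
img2dataset \
  --url_list /path/to/filtered-laion400m-metadata \
  --input_format parquet \
  --url_col URL --caption_col TEXT \
  --output_format webdataset \
  --output_folder "$DATA_BASE_DIR/laion400m"
```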
SUN397
- Create a folder named sun397/ under ./data.
- Download the images from http://vision.princeton.edu/projects/2010/SUN/SUN397.tar.gz.
- Download the partitions from https://vision.princeton.edu/projects/2010/SUN/download/Partitions.zip.
- Extract these files under ./data/sun397/.
- Download split_zhou_SUN397.json from this link and put it under ./data/sun397 (a shell sketch of these steps follows the directory listing below).
The directory structure should look like
sun397/
|–– SUN397/
|–– split_zhou_SUN397.json
|–– ... # a bunch of .txt files
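The steps above can be scripted roughly as follows; the split file comes from the link in the list and is not reproduced here.

```bash
# Download and extract SUN397 under ./data/sun397
mkdir -p ./data/sun397
wget -P ./data/sun397 http://vision.princeton.edu/projects/2010/SUN/SUN397.tar.gz
wget -P ./data/sun397 https://vision.princeton.edu/projects/2010/SUN/download/Partitions.zip
tar -xzf ./data/sun397/SUN397.tar.gz -C ./data/sun397
unzip ./data/sun397/Partitions.zip -d ./data/sun397
# Place split_zhou_SUN397.json (from the link above) in ./data/sun397
```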
UCF101
- Create a folder named ucf101/ under ./data.
- Download the zip file UCF-101-midframes.zip from here and extract it to ./data/ucf101/. This zip file contains the extracted middle video frames.
- Download split_zhou_UCF101.json from this link and put it under ./data/ucf101.
The directory structure should look like
ucf101/
|–– UCF-101-midframes/
|–– split_zhou_UCF101.json
Citation
```bibtex
@article{baumann2024bayesvlm,
  title   = {Post-hoc Probabilistic Vision-Language Models},
  author  = {Anton Baumann and Rui Li and Marcus Klasson and Santeri Mentu and Shyamgopal Karthik and Zeynep Akata and Arno Solin and Martin Trapp},
  year    = {2024},
  journal = {arXiv preprint arXiv:2412.06014}
}
```
License
This software is provided under the MIT license.
Owner
- Name: AaltoML
- Login: AaltoML
- Kind: organization
- Location: Finland
- Website: http://arno.solin.fi
- Repositories: 20
- Profile: https://github.com/AaltoML
Machine learning group at Aalto University led by Prof. Solin
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software or build upon this work, please cite it as below."
preferred-citation:
  type: article
  title: "Post-hoc Probabilistic Vision-Language Models"
  authors:
    - family-names: "Baumann"
      given-names: "Anton"
    - family-names: "Li"
      given-names: "Rui"
    - family-names: "Klasson"
      given-names: "Marcus"
    - family-names: "Mentu"
      given-names: "Santeri"
    - family-names: "Karthik"
      given-names: "Shyamgopal"
    - family-names: "Akata"
      given-names: "Zeynep"
    - family-names: "Solin"
      given-names: "Arno"
    - family-names: "Trapp"
      given-names: "Martin"
  journal: "arXiv preprint arxiv:2412.06014"
  year: 2024
GitHub Events
Total
- Issues event: 1
- Watch event: 7
- Issue comment event: 3
- Member event: 2
- Push event: 13
- Fork event: 6
- Create event: 2
Last Year
- Issues event: 1
- Watch event: 7
- Issue comment event: 3
- Member event: 2
- Push event: 13
- Fork event: 6
- Create event: 2
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: 8 days
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: 8 days
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Divyanshsingh1910 (1)