Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: schwallergroup
- License: other
- Language: Python
- Default Branch: main
- Homepage: https://arxiv.org/abs/2504.06265
- Size: 344 KB
Statistics
- Stars: 9
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
GOLLuM: Gaussian Process Optimized LLMs – Reframing LLMs as Principled Bayesian Optimizers 🧙‍♂️📈
GOLLuM – Gaussian Process Optimized LLMs are here!
One representation to rule them all!
🔍 Overview
🎯 GOLLuM addresses the challenge of harnessing LLMs for optimization under uncertainty by introducing:
- LLM-based deep kernels, jointly optimized with GPs to preserve the benefits of both
- LLMs to provide a rich and flexible input space for Bayesian optimization
- GPs to model this space with predictive uncertainty for more efficient sampling
🌌 The framework enables a bidirectional feedback loop:
1. The GP guides updates to LLM weights to produce more effective embeddings
2. These embeddings enhance the GP's probabilistic modeling
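The objective that couples the two halves of this loop is the GP log marginal likelihood: it scores how well the current embeddings explain the observed outcomes, and in GOLLuM its gradient flows back into the LLM weights. Below is a minimal NumPy sketch of that score (RBF kernel over fixed toy embeddings; the data and kernel settings are invented for illustration, and the actual implementation uses GPyTorch):

```python
import numpy as np

def rbf_kernel(Z, lengthscale=1.0, variance=1.0):
    """RBF kernel over embedding vectors Z (n x d)."""
    sq = np.sum(Z**2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2 * Z @ Z.T            # pairwise squared distances
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_log_marginal_likelihood(Z, y, noise=1e-2):
    """log p(y | Z) for a zero-mean GP. This is the quantity the GP side
    maximizes w.r.t. kernel parameters and, in GOLLuM, also w.r.t. the
    LLM weights that produce the embeddings Z."""
    n = len(y)
    K = rbf_kernel(Z) + noise * np.eye(n)
    L = np.linalg.cholesky(K)               # stable inversion via Cholesky
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2 * np.pi))

# Toy illustration: embeddings that cluster consistently with the
# outcomes y score a higher marginal likelihood.
y = np.array([1.0, 1.1, -1.0, -0.9])
clustered = np.array([[0.0], [0.1], [5.0], [5.1]])  # grouped like y
scrambled = np.array([[0.0], [5.0], [0.1], [5.1]])  # ignores y
print(gp_log_marginal_likelihood(clustered, y) >
      gp_log_marginal_likelihood(scrambled, y))      # True
```

Maximizing this quantity is what implicitly produces the "distinct performance regions" in the latent space: embeddings that group similar outcomes together fit the GP better.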
🧠 Key Features
- Unified Representation Learning: Uses textual templates to represent heterogeneous parameter types (categorical, numerical, structural)
- GP-Guided LLM Finetuning: Optimizes LLM embeddings through GP marginal likelihood
- Implicit Contrastive Learning: Automatically organizes the latent space into distinct performance regions
- Chemical reasoning in the latent space: Uncovering chemical patterns under extreme low-data regimes
- Architecture Agnostic: Works with various LLM architectures (encoder, decoder, encoder-decoder)
- Domain Agnostic: No requirement for domain-specialized models or pretraining
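The unified representation hinges on serializing mixed parameter types into one text string that the LLM can embed. A hypothetical sketch of such a template (field names and wording invented for illustration; the repository's actual templates live in its configs):

```python
def reaction_to_text(params: dict) -> str:
    """Render a heterogeneous reaction configuration (structural,
    categorical, and numerical fields) as one text string, so a single
    LLM embedding covers all parameter types."""
    parts = []
    if "ligand_smiles" in params:                  # structural parameter
        parts.append(f"ligand SMILES: {params['ligand_smiles']}")
    if "solvent" in params:                        # categorical parameter
        parts.append(f"solvent: {params['solvent']}")
    if "temperature_c" in params:                  # numerical parameter
        parts.append(f"temperature: {params['temperature_c']} C")
    return ", ".join(parts)

text = reaction_to_text({
    "ligand_smiles": "CC(C)P(C(C)C)c1ccccc1",
    "solvent": "toluene",
    "temperature_c": 90,
})
print(text)
# ligand SMILES: CC(C)P(C(C)C)c1ccccc1, solvent: toluene, temperature: 90 C
```

Because every parameter ends up as text, adding a new parameter type only requires extending the template, not changing the model.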
🚀 Quickstart
📦 Project Dependencies
You can install the environment from a file:
```bash
# Recommended (Conda)
conda env create -f environment.yaml
conda activate gollum

# OR (pip-only)
pip install -r requirements.txt
```
For manual setup or more details, see docs/DEPENDENCIES.md.
🛠 Install GOLLuM in editable mode
```bash
pip install -e .
```
⚙️ Running Experiments
All configuration files for reproducing experiments are included in the configs/ directory. You can launch an experiment with:
```bash
python train.py --config=configs/pllm_phi.yaml
```
Replace pllm_phi.yaml with other config files for variants such as llm_phi.yaml, pllm.yaml, etc.
📚 Citation
```bibtex
@inproceedings{rankovic2025gollum,
  title={{GOLL}uM: Gaussian Process Optimized {LLM}s {\textemdash} Reframing {LLM} Finetuning through Bayesian Optimization},
  author={Bojana Rankovi{\'c} and Philippe Schwaller},
  booktitle={ICLR 2025 Workshop on World Models: Understanding, Modelling and Scaling},
  year={2025},
  url={https://openreview.net/forum?id=2ORViHAUbf}
}
```
⚖️ License
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
🤝 Acknowledgements
This work was supported by NCCR Catalysis (grant number 225147), a National Centre of Competence in Research funded by the Swiss National Science Foundation.
Owner
- Name: schwallergroup
- Login: schwallergroup
- Kind: organization
- Repositories: 1
- Profile: https://github.com/schwallergroup
Citation (CITATION.cff)
```yaml
cff-version: 1.0.2
message: "If you use this software, please cite it as below."
title: "gollum"
authors:
  - name: "Bojana Rankovic"
version: 0.0.1
doi:
url: "https://github.com/schwallergroup/gollum"
```
GitHub Events
Total
- Watch event: 14
- Push event: 2
- Public event: 1
Last Year
- Watch event: 14
- Push event: 2
- Public event: 1
Dependencies
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite
- InstructorEmbedding *
- botorch *
- drfp *
- gpytorch *
- jsonargparse *
- matplotlib *
- openai *
- pandas *
- peft *
- pytorch-lightning *
- rdkit *
- rxnfp *
- scikit-learn *
- scikit-learn-extra *
- sentence-transformers *
- torch *
- tqdm *
- transformers *
- wandb *
- numpy
- pandas
- pip
- python 3.10.*
- pytorch-lightning
- scikit-learn
- scikit-learn-extra
- scipy
- tqdm