forecastbench
A dynamic forecasting benchmark for LLMs
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: forecastingresearch
- License: MIT
- Language: Python
- Default Branch: main
- Homepage: https://www.forecastbench.org
- Size: 1.62 MB
Statistics
- Stars: 27
- Watchers: 5
- Forks: 3
- Open Issues: 31
- Releases: 0
Metadata Files
README.md
ForecastBench
A dynamic, continuously-updated benchmark to evaluate LLM forecasting capabilities. More at www.forecastbench.org.
Datasets
Leaderboards and datasets are updated nightly and available at github.com/forecastingresearch/forecastbench-datasets.
Participate in the benchmark
Instructions for submitting your model to the benchmark are in How-to-submit-to-ForecastBench.
Wiki
Dig into the details of ForecastBench on the wiki.
Citation
@inproceedings{karger2025forecastbench,
  title={ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities},
  author={Ezra Karger and Houtan Bastani and Chen Yueh-Han and Zachary Jacobs and Danny Halawi and Fred Zhang and Philip E. Tetlock},
  year={2025},
  booktitle={International Conference on Learning Representations (ICLR)},
  url={https://iclr.cc/virtual/2025/poster/28507}
}
Getting started for devs
Local setup
* git clone --recurse-submodules <repo-url>.git
* cd forecastbench
* cp variables.example.mk variables.mk and set the values accordingly
* Set up your Python virtual environment:
  * make setup-python-env
  * source .venv/bin/activate
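After copying variables.example.mk to variables.mk, it can help to confirm that every value has been filled in. The following is a minimal sketch of such a check, not part of the repository. It assumes both files contain plain KEY=VALUE lines (no `:=`, `export`, or multi-line values), and the helper names `read_keys` and `unset_keys` are hypothetical.

```python
from pathlib import Path


def read_keys(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments.

    Assumption: the file uses plain KEY=VALUE assignments only.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unset_keys(example_text: str, local_text: str) -> list[str]:
    """Return keys that the example file lists but the local file leaves missing or empty."""
    example = read_keys(example_text)
    local = read_keys(local_text)
    return sorted(key for key in example if not local.get(key))


if __name__ == "__main__":
    example_path = Path("variables.example.mk")
    local_path = Path("variables.mk")
    # Only run the check when both files are present in the current directory.
    if example_path.exists() and local_path.exists():
        missing = unset_keys(example_path.read_text(), local_path.read_text())
        print("Unset keys:", ", ".join(missing) if missing else "none")
```

Run from the repository root, this prints the keys that still need a value, or "none" when every key from the example file is set.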
Run GCP Cloud Functions locally
* cd directory/containing/cloud/function
* eval $(cat path/to/variables.mk | xargs) python main.py
Contributions
Before creating a pull request:
* run make lint and fix any errors and warnings
* ensure the code has been deployed to Google Cloud Platform and tested (this applies only to our devs; for everyone else, we're happy you're contributing and we'll test it on our end)
* fork the repo
* reference the issue number (if one exists) in the commit message
* push to the fork on a branch other than main
* create a pull request
Owner
- Name: forecastingresearch
- Login: forecastingresearch
- Kind: organization
- Repositories: 1
- Profile: https://github.com/forecastingresearch
GitHub Events
Total
- Issues event: 27
- Watch event: 21
- Delete event: 2
- Issue comment event: 7
- Push event: 80
- Pull request event: 4
- Gollum event: 84
- Fork event: 2
- Create event: 3
Last Year
- Issues event: 27
- Watch event: 21
- Delete event: 2
- Issue comment event: 7
- Push event: 80
- Pull request event: 4
- Gollum event: 84
- Fork event: 2
- Create event: 3
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 15
- Total pull requests: 2
- Average time to close issues: 3 months
- Average time to close pull requests: about 5 hours
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 0.13
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 15
- Pull requests: 2
- Average time to close issues: 3 months
- Average time to close pull requests: about 5 hours
- Issue authors: 2
- Pull request authors: 1
- Average comments per issue: 0.13
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- houtanb (25)
- connachermurphy (1)
Pull Request Authors
- zjac0bs (2)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v3 composite
- black >=24.4.0
- flake8 >=7.0.0
- flake8-bugbear >=24.2.6
- isort >=5.13.2
- pydocstyle >=6.3.0
- gcsfs ==2024.3.1
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- scipy *
- termcolor *
- tqdm *
- anthropic *
- functions-framework ==3.
- google-cloud-secret-manager *
- google-cloud-storage *
- google-generativeai *
- mistralai ==0.4.2
- openai *
- pandas *
- together *
- anthropic *
- google-cloud-secret-manager *
- google-cloud-storage *
- google-generativeai *
- mistralai ==0.4.2
- openai *
- pandas *
- together *
- google-cloud-run *
- google-cloud-secret-manager *
- slack_sdk *
- google-cloud-run *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas >=2.2.2
- pytz *
- slack_sdk *
- tqdm *
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- requests *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- tqdm *
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- tqdm *
- backoff *
- bs4 *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas >=2.2.2
- requests *
- tqdm *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas >=2.2.2
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- requests *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas >=2.2.2
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- pyarrow *
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- pyarrow *
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- pyarrow *
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- pyarrow *
- backoff *
- certifi *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- py_clob_client *
- requests *
- tqdm *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- beautifulsoup4 *
- google-cloud-secret-manager *
- google-cloud-storage *
- lxml *
- pandas *
- tqdm *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- tqdm *
- bs4 *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas >=2.2.2
- requests *
- tqdm *
- yfinance *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas >=2.2.2
- yfinance *
- gcsfs ==2024.3.1
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- tqdm *
- google-cloud-storage *
- pandas *
- tqdm *
- GitPython *
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- tqdm *
- GitPython *
- anthropic *
- backoff *
- certifi *
- google-api-python-client *
- google-cloud-aiplatform *
- google-cloud-artifact-registry *
- google-cloud-container *
- google-cloud-run *
- google-cloud-secret-manager *
- google-cloud-storage *
- google-generativeai *
- ipython *
- jupyter *
- matplotlib *
- mistralai ==0.4.2
- openai *
- pandas *
- papermill *
- pydantic >=2.5.2,<3.0.0
- requests *
- slack_sdk *
- together *
- GitPython *
- anthropic *
- backoff *
- certifi *
- google-api-python-client *
- google-cloud-aiplatform *
- google-cloud-artifact-registry *
- google-cloud-container *
- google-cloud-run *
- google-cloud-secret-manager *
- google-cloud-storage *
- google-genai >=1.0.0
- ipython *
- jupyter *
- matplotlib *
- mistralai ==0.4.2
- openai *
- pandas *
- papermill *
- pydantic >=2.5.2,<3.0.0
- requests *
- slack_sdk *
- together *
- GitPython *
- gcsfs ==2024.3.1
- google-cloud-secret-manager *
- google-cloud-storage *
- pandas *
- tqdm *
- GitPython *
- google-cloud-secret-manager *
- google-cloud-storage *
- python-dateutil *
- pytz *