turbozero

fast + parallel AlphaZero in JAX

https://github.com/lowrollr/turbozero

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.4%) to scientific vocabulary

Keywords

alphazero gpu-acceleration jax mcts monte-carlo-tree-search reinforcement-learning vectorization
Last synced: 6 months ago

Repository

fast + parallel AlphaZero in JAX

Basic Info
  • Host: GitHub
  • Owner: lowrollr
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 28.8 MB
Statistics
  • Stars: 94
  • Watchers: 1
  • Forks: 9
  • Open Issues: 1
  • Releases: 0
Topics
alphazero gpu-acceleration jax mcts monte-carlo-tree-search reinforcement-learning vectorization
Created almost 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme Funding License Citation

README.md

turbozero 🏎️ 🏎️ 🏎️ 🏎️

📣 If you're looking for the old PyTorch version of turbozero, it's been moved here: turbozero_torch 📣

turbozero is a vectorized implementation of AlphaZero written in JAX

It contains:

  • Monte Carlo Tree Search with subtree persistence
  • Batched Replay Memory
  • A complete, customizable training/evaluation loop
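To give a feel for what a batched replay memory involves, here is a minimal sketch of a per-environment ring buffer in JAX. It is not turbozero's implementation: `init_buffer`, `add`, and the dictionary layout are hypothetical names chosen for illustration, and each stored experience is a single float where a real buffer would hold a structured record.

```python
import jax
import jax.numpy as jnp

def init_buffer(batch_size, capacity):
    # One ring buffer per environment, stored as a single (batch, capacity) array.
    return {
        "data": jnp.zeros((batch_size, capacity), dtype=jnp.float32),
        "next_idx": jnp.zeros((batch_size,), dtype=jnp.int32),
    }

@jax.jit
def add(buffer, values):
    # Write one value per environment at that environment's write cursor.
    rows = jnp.arange(values.shape[0])
    data = buffer["data"].at[rows, buffer["next_idx"]].set(values)
    # Advance each cursor, wrapping so the oldest entry is overwritten next.
    next_idx = (buffer["next_idx"] + 1) % data.shape[1]
    return {"data": data, "next_idx": next_idx}

# Two environments, four slots each; the fifth write wraps around to slot 0.
buffer = init_buffer(batch_size=2, capacity=4)
for t in range(5):
    buffer = add(buffer, jnp.array([t, 10 + t], dtype=jnp.float32))
```

Because the whole batch lives in one array and `add` is a pure function, the update can be JIT-compiled and called from inside a larger compiled training step.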

turbozero is fast and parallelized:

  • every consequential part of the training loop is JIT-compiled
  • partitions across multiple GPUs by default when available 🚀 NEW! 🚀
  • self-play and evaluation episodes are batched/vmapped with hardware-acceleration in mind
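As a rough illustration of the batching idea above (not turbozero's actual API), the sketch below lifts a toy single-environment step function to a batch with `jax.vmap` and compiles it with `jax.jit`. The `step` function, its counter state, and the reward rule are hypothetical stand-ins for a real environment.

```python
import jax
import jax.numpy as jnp

def step(state, action):
    # Toy single-environment transition: the state is a counter
    # that grows by the chosen action.
    new_state = state + action
    # Reward 1.0 once the counter reaches 3, otherwise 0.0.
    reward = jnp.where(new_state >= 3, 1.0, 0.0)
    return new_state, reward

# vmap turns the single-environment step into a batched one;
# jit compiles the batched function once for reuse.
batched_step = jax.jit(jax.vmap(step))

states = jnp.zeros(4, dtype=jnp.int32)
actions = jnp.array([1, 2, 3, 4], dtype=jnp.int32)
next_states, rewards = batched_step(states, actions)
```

The same pattern applies to anything written as a pure function of one environment's state: write it for a single episode, then let `vmap` add the batch dimension.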

turbozero is extendable:

turbozero is flexible:

  • Easy to integrate with your custom JAX environment or neural network architecture.
  • Use the provided training and evaluation utilities, or pick and choose the components that you need.

To get started, check out the Hello World Notebook

Installation

turbozero uses poetry for dependency management. You can install it with:

    pip install poetry==1.7.1

Then, to install dependencies:

    poetry install

If you're using a GPU/TPU/etc., after running the previous command you'll need to install the device-specific version of JAX.

For a GPU w/ CUDA 12, first point poetry towards the JAX CUDA releases:

    poetry source add jax https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

Then install the CUDA 12 release of JAX:

    poetry add jax[cuda12_pip]==0.4.35

See https://jax.readthedocs.io/en/latest/installation.html for other devices/CUDA versions.

I have tested this project with CUDA 11 and CUDA 12.
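A quick way to check that the device-specific JAX install was picked up is to list the devices JAX can see. This small check is not part of turbozero; it uses only the standard `jax.devices()` and `jax.default_backend()` calls, and what it prints depends on your machine.

```python
import jax

# Devices visible to JAX on this machine.
devices = jax.devices()

# Name of the backend JAX will use by default.
backend = jax.default_backend()
print("default backend:", backend)

for d in devices:
    print(d.platform, d.id)
```

If only CPU devices show up on a machine with a GPU, the CUDA build of JAX was most likely not installed into the poetry environment.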

To register an IPython kernel for this environment, run:

    poetry run python -m ipykernel install --user --name turbozero

Issues

If you use this project and encounter an issue, error, or undesired behavior, please submit a GitHub Issue and I will do my best to resolve it as soon as I can. You may also contact me directly via hello@jacob.land.

Contributing

Contributions, improvements, and fixes are more than welcome! For now I don't have a formal process for this, other than creating a Pull Request. For large changes, consider creating an Issue beforehand.

If you are interested in contributing but don't know what to work on, please reach out. I have plenty of things you could do.

References

Papers/Repos I found helpful.

Repositories:

  • google-deepmind/mctx: Monte Carlo tree search in JAX
  • sotetsuk/pgx: Vectorized RL game environments in JAX
  • instadeepai/flashbax: Accelerated Replay Buffers in JAX
  • google-deepmind/open_spiel: RL algorithms

Papers:

  • Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
  • Revisiting Fundamentals of Experience Replay

Cite This Work

If you found this work useful, please cite it with:

    @software{turbozero,
      author = {Marshall, Jacob},
      title = {{turbozero: fast + parallel AlphaZero}},
      url = {https://github.com/lowrollr/turbozero}
    }

Owner

  • Name: Jacob Marshall
  • Login: lowrollr
  • Kind: user
  • Location: San Francisco
  • Company: Unaffiliated

Citation (CITATION.cff)

cff-version: 1.2.0
title: "turbozero: fast + parallel AlphaZero"
abstract: vectorized implementation of AlphaZero/MCTS with training and evaluation utilities
url: "https://github.com/lowrollr/turbozero"
authors:
- family-names: "Marshall" 
  given-names: "Jacob"

GitHub Events

Total
  • Issues event: 5
  • Watch event: 15
  • Issue comment event: 7
  • Push event: 5
  • Pull request event: 7
  • Fork event: 4
  • Create event: 2
Last Year
  • Issues event: 5
  • Watch event: 15
  • Issue comment event: 7
  • Push event: 5
  • Pull request event: 7
  • Fork event: 4
  • Create event: 2

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 500
  • Total Committers: 3
  • Avg Commits per committer: 166.667
  • Development Distribution Score (DDS): 0.036
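The two derived figures above can be reproduced from the per-committer counts listed under Top Committers. The sketch below assumes the common definition of the Development Distribution Score, one minus the top committer's share of all commits. That definition is an assumption here, since the page does not state its formula.

```python
# Commit counts taken from the Top Committers table on this page.
commits = {"lowrollr": 482, "Jacob Marshall": 17, "Davide Angioni": 1}

total = sum(commits.values())
avg_per_committer = total / len(commits)

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - max(commits.values()) / total

print(total, round(avg_per_committer, 3), round(dds, 3))
```

A score this close to zero means nearly all commits come from a single committer.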
Past Year
  • Commits: 500
  • Committers: 3
  • Avg Commits per committer: 166.667
  • Development Distribution Score (DDS): 0.036
Top Committers
Name | Email | Commits
lowrollr | 9****r | 482
Jacob Marshall | l****v@g****m | 17
Davide Angioni | d****i@o****m | 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: about 2 years ago

All Time
  • Total issues: 3
  • Total pull requests: 2
  • Average time to close issues: about 12 hours
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 3.67
  • Average comments per pull request: 0.5
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 2
  • Average time to close issues: about 12 hours
  • Average time to close pull requests: about 2 hours
  • Issue authors: 2
  • Pull request authors: 2
  • Average comments per issue: 3.67
  • Average comments per pull request: 0.5
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Nightbringers (2)
  • bubble-07 (2)
  • lowrollr (2)
  • CDM1619 (1)
  • DavideTr8 (1)
  • fohmij (1)
  • Sword-Nomad (1)
  • DuaneNielsen (1)
  • ConstantinRuhdorfer (1)
  • r-wedeen (1)
Pull Request Authors
  • lowrollr (6)
  • xvalcarce (2)
  • DavideTr8 (1)
Top Labels
Issue Labels
enhancement (2) bug (1) good first issue (1)
Pull Request Labels

Dependencies

poetry.lock pypi
  • appnope 0.1.3
  • asttokens 2.2.1
  • backcall 0.2.0
  • bottleneck 1.3.7
  • cffi 1.15.1
  • colorama 0.4.6
  • comm 0.1.4
  • contourpy 1.1.0
  • cycler 0.11.0
  • debugpy 1.6.7.post1
  • decorator 5.1.1
  • executing 1.2.0
  • filelock 3.12.2
  • fonttools 4.42.0
  • importlib-metadata 6.8.0
  • importlib-resources 6.0.1
  • ipykernel 6.25.1
  • ipython 8.14.0
  • jedi 0.19.0
  • jinja2 3.1.2
  • jupyter-client 8.3.0
  • jupyter-core 5.3.1
  • kiwisolver 1.4.4
  • markupsafe 2.1.3
  • matplotlib 3.7.2
  • matplotlib-inline 0.1.6
  • mpmath 1.3.0
  • nest-asyncio 1.5.7
  • networkx 3.1
  • numpy 1.25.2
  • packaging 23.1
  • parso 0.8.3
  • pexpect 4.8.0
  • pickleshare 0.7.5
  • pillow 10.0.0
  • platformdirs 3.10.0
  • prompt-toolkit 3.0.39
  • psutil 5.9.5
  • ptyprocess 0.7.0
  • pure-eval 0.2.2
  • pycparser 2.21
  • pygments 2.16.1
  • pyparsing 3.0.9
  • python-dateutil 2.8.2
  • pywin32 306
  • pyyaml 6.0.1
  • pyzmq 25.1.1
  • six 1.16.0
  • stack-data 0.6.2
  • sympy 1.12
  • torch 2.0.1
  • tornado 6.3.3
  • traitlets 5.9.0
  • typing-extensions 4.7.1
  • wcwidth 0.2.6
  • zipp 3.16.2
pyproject.toml pypi
  • Bottleneck ^1.3.7
  • colorama ^0.4.6
  • ipykernel ^6.25.1
  • ipython ^8.14.0
  • matplotlib ^3.7.2
  • numpy ^1.25.2
  • python ^3.9
  • pyyaml ^6.0.1
  • torch ^2.0.1