https://github.com/instadeepai/jumanji

🕹️ A diverse suite of scalable reinforcement learning environments in JAX

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (7.4%) to scientific vocabulary

Keywords

jax python reinforcement-learning research

Keywords from Contributors

gym hyperparameter-optimization hyperparameter-search rl transformers cryptocurrencies large-language-models diffusion vlms

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: instadeepai
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://instadeepai.github.io/jumanji
Size: 69.5 MB

Statistics

Stars: 731
Watchers: 13
Forks: 92
Open Issues: 27
Releases: 14

Topics

jax python reinforcement-learning research

Created almost 4 years ago · Last pushed about 1 year ago

Metadata Files

Readme Contributing License

README.md

Environments | Installation | Quickstart | Training | Citation

Jumanji @ ICLR 2024

Jumanji has been accepted at ICLR 2024. Check out our research paper.

Welcome to the Jungle! 🌴

Jumanji is a diverse suite of scalable reinforcement learning environments written in JAX. It now features 22 environments!

Jumanji is helping pioneer a new wave of hardware-accelerated research and development in the field of RL. Jumanji's high-speed environments enable faster iteration and large-scale experimentation while simultaneously reducing complexity. Originating in the research team at InstaDeep, Jumanji is now developed jointly with the open-source community. To join us in these efforts, reach out, raise issues and read our contribution guidelines or just star 🌟 to stay up to date with the latest developments!

Goals 🚀

Provide a simple, well-tested API for JAX-based environments.
Make research in RL more accessible.
Facilitate research on RL for industrial problems and help close the gap between research and industrial applications.
Provide environments whose difficulty can be scaled to be arbitrarily hard.

Overview 🦜

🥑 Environment API: core abstractions for JAX-based environments.
🕹️ Environment Suite: a collection of RL environments ranging from simple games to NP-hard combinatorial problems.
🍬 Wrappers: easily connect to your favourite RL frameworks and libraries such as Acme, Stable Baselines3, RLlib, Gymnasium and DeepMind-Env through our dm_env and gym wrappers.
🎓 Examples: guides to facilitate Jumanji's adoption and highlight the added value of JAX-based environments.
🏎️ Training: example agents that can be used as inspiration for the agents one may implement in their research.

Environments 🌍

Jumanji provides a diverse range of environments ranging from simple games to NP-hard combinatorial problems.

| Environment | Category | Registered Version(s) | Source | Description |
|---|---|---|---|---|
| 🔢 Game2048 | Logic | Game2048-v1 | code | doc |
| 🎨 GraphColoring | Logic | GraphColoring-v1 | code | doc |
| 💣 Minesweeper | Logic | Minesweeper-v0 | code | doc |
| 🎲 RubiksCube | Logic | RubiksCube-v0<br/>RubiksCube-partly-scrambled-v0 | code | doc |
| 🔀 SlidingTilePuzzle | Logic | SlidingTilePuzzle-v0 | code | doc |
| ✏️ Sudoku | Logic | Sudoku-v0<br/>Sudoku-very-easy-v0 | code | doc |
| 📦 BinPack (3D BinPacking Problem) | Packing | BinPack-v1 | code | doc |
| 🧩 FlatPack (2D Grid Filling Problem) | Packing | FlatPack-v0 | code | doc |
| 🏭 JobShop (Job Shop Scheduling Problem) | Packing | JobShop-v0 | code | doc |
| 🎒 Knapsack | Packing | Knapsack-v1 | code | doc |
| ▒ Tetris | Packing | Tetris-v0 | code | doc |
| 🧹 Cleaner | Routing | Cleaner-v0 | code | doc |
| :link: Connector | Routing | Connector-v2 | code | doc |
| 🚚 CVRP (Capacitated Vehicle Routing Problem) | Routing | CVRP-v1 | code | doc |
| 🚚 MultiCVRP (Multi-Agent Capacitated Vehicle Routing Problem) | Routing | MultiCVRP-v0 | code | doc |
| :mag: Maze | Routing | Maze-v0 | code | doc |
| :robot: RobotWarehouse | Routing | RobotWarehouse-v0 | code | doc |
| 🐍 Snake | Routing | Snake-v1 | code | doc |
| 📬 TSP (Travelling Salesman Problem) | Routing | TSP-v1 | code | doc |
| Multi Minimum Spanning Tree Problem | Routing | MMST-v0 | code | doc |
| ᗧ•••ᗣ•• PacMan | Routing | PacMan-v1 | code | doc |
| 👾 Sokoban | Routing | Sokoban-v0 | code | doc |
| 🍎 Level-Based Foraging | Routing | LevelBasedForaging-v0 | code | doc |
| 🚁 Search and Rescue | Swarms | SearchAndRescue-v0 | code | doc |

Installation 🎬

You can install the latest release of Jumanji from PyPI:

`pip install -U jumanji`

Alternatively, you can install the latest development version directly from GitHub:

`pip install git+https://github.com/instadeepai/jumanji.git`

Jumanji has been tested on Python 3.10, 3.11 and 3.12. Note that because the installation of JAX differs depending on your hardware accelerator, we advise users to explicitly install the correct JAX version (see the official installation guide).
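Since the right JAX build depends on the accelerator, it can help to confirm which backend JAX actually picked up after installation. The snippet below is a minimal check that relies only on the public `jax.default_backend()` and `jax.devices()` calls; it does not import Jumanji.

```python
import jax

# Name of the platform JAX selected, e.g. "cpu", "gpu" or "tpu".
backend = jax.default_backend()

# Devices that JAX can see on that platform.
devices = jax.devices()

print(f"JAX {jax.__version__} is using backend {backend!r} with {len(devices)} device(s)")
```

If the reported backend is `cpu` on a machine with an accelerator, the installed JAX build most likely does not match the hardware.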

Rendering: Matplotlib is used for rendering all the environments. To visualize the environments you will need a GUI backend. For example, on Linux, you can install Tk via `apt-get install python3-tk`, or using conda: `conda install tk`. Check out Matplotlib backends for a list of backends you can use.

Quickstart ⚡

RL practitioners will find Jumanji's interface familiar as it combines the widely adopted OpenAI Gym and DeepMind Environment interfaces. From OpenAI Gym, we adopted the idea of a registry and the render method, while our TimeStep structure is inspired by DeepMind Environment.

Basic Usage 🧑‍💻

```python
import jax
import jumanji

# Instantiate a Jumanji environment using the registry
env = jumanji.make('Snake-v1')

# Reset your (jit-able) environment
key = jax.random.PRNGKey(0)
state, timestep = jax.jit(env.reset)(key)

# (Optional) Render the env state
env.render(state)

# Interact with the (jit-able) environment
action = env.action_spec.generate_value()  # Action selection (dummy value here)
state, timestep = jax.jit(env.step)(state, action)  # Take a step and observe the next state and time step
```

state represents the internal state of the environment: it contains all the information required to take a step when executing an action. This should not be confused with the observation contained in the timestep, which is the information perceived by the agent.
timestep is a dataclass containing step_type, reward, discount, observation and extras. This structure is similar to dm_env.TimeStep except for the extras field, which was added to allow users to log environment metrics that are neither part of the agent's observation nor part of the environment's internal state.
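To make the five fields concrete, here is a small illustrative stand-in for such a structure. `ToyTimeStep` and `is_last` are hypothetical names invented for this sketch and are not Jumanji's actual classes; the step-type codes follow the dm_env convention (FIRST=0, MID=1, LAST=2).

```python
from typing import Any, Dict, NamedTuple

import jax.numpy as jnp

# Step-type codes following the dm_env convention.
FIRST, MID, LAST = 0, 1, 2


class ToyTimeStep(NamedTuple):
    """Hypothetical stand-in for a time step; not Jumanji's actual class."""

    step_type: jnp.ndarray    # FIRST, MID or LAST
    reward: jnp.ndarray       # reward received on this transition
    discount: jnp.ndarray     # typically 0.0 on termination
    observation: Any          # what the agent perceives
    extras: Dict[str, Any]    # logged metrics, hidden from the agent


def is_last(timestep: ToyTimeStep) -> jnp.ndarray:
    """Return True when the episode ended on this step."""
    return timestep.step_type == LAST


timestep = ToyTimeStep(
    step_type=jnp.asarray(LAST, dtype=jnp.int8),
    reward=jnp.asarray(1.0),
    discount=jnp.asarray(0.0),
    observation=jnp.zeros((4,)),
    extras={"episode_length": jnp.asarray(17)},  # made-up metric for illustration
)
print(bool(is_last(timestep)))
```

Keeping logged metrics in `extras` rather than in the observation means they can be recorded without leaking information to the agent.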

Advanced Usage 🧑‍🔬

Being written in JAX, Jumanji's environments benefit from many of its features including automatic vectorization/parallelization (jax.vmap, jax.pmap) and JIT-compilation (jax.jit), which can be composed arbitrarily. We provide an example of a more advanced usage in the advanced usage guide.
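As a rough illustration of how these transformations compose, the sketch below batches a toy environment over several random keys. `toy_reset` and `toy_step` are hypothetical functions written for this example, not part of Jumanji; only `jax.vmap`, `jax.jit` and `jax.random` are real JAX APIs.

```python
import jax
import jax.numpy as jnp


def toy_reset(key):
    """Hypothetical reset: the state is a single random position in [0, 1)."""
    return jax.random.uniform(key, ())


def toy_step(state, action):
    """Hypothetical step: move by `action` and reward closeness to 0.5."""
    next_state = state + action
    reward = -jnp.abs(next_state - 0.5)
    return next_state, reward


num_envs = 8
keys = jax.random.split(jax.random.PRNGKey(0), num_envs)

# vmap adds a leading batch axis; jit compiles the batched function.
batched_reset = jax.jit(jax.vmap(toy_reset))
batched_step = jax.jit(jax.vmap(toy_step))

states = batched_reset(keys)             # shape (num_envs,)
actions = jnp.full((num_envs,), 0.1)     # one dummy action per environment
states, rewards = batched_step(states, actions)
print(states.shape, rewards.shape)
```

The same wrapping pattern would presumably apply to a real environment's `reset` and `step` functions, since both are described above as jit-able.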

Registry and Versioning 📖

Like OpenAI Gym, Jumanji keeps a strict versioning of its environments for reproducibility reasons. We maintain a registry of standard environments with their configuration. For each environment, a version suffix is appended, e.g. Snake-v1. When changes are made to environments that might impact learning results, the version number is incremented by one to prevent potential confusion. For a full list of registered versions of each environment, check out the documentation.
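The idea of a versioned registry can be sketched in a few lines. The code below is an illustrative toy, not Jumanji's implementation: `register`, `make`, `parse_env_id` and the ID pattern are assumptions made for this example.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical registry mapping "<Name>-v<N>" IDs to environment factories.
_REGISTRY: Dict[str, Callable[[], object]] = {}
_ID_PATTERN = re.compile(r"^(?P<name>[\w-]+)-v(?P<version>\d+)$")


def parse_env_id(env_id: str) -> Tuple[str, int]:
    """Split an ID such as 'Snake-v1' into ('Snake', 1)."""
    match = _ID_PATTERN.match(env_id)
    if match is None:
        raise ValueError(f"Environment ID {env_id!r} must look like '<Name>-v<N>'.")
    return match.group("name"), int(match.group("version"))


def register(env_id: str, factory: Callable[[], object]) -> None:
    """Add a factory under a versioned ID, refusing silent overwrites."""
    parse_env_id(env_id)  # validates the ID format
    if env_id in _REGISTRY:
        raise ValueError(f"Environment ID {env_id!r} is already registered.")
    _REGISTRY[env_id] = factory


def make(env_id: str) -> object:
    """Build the environment registered under `env_id`."""
    if env_id not in _REGISTRY:
        raise KeyError(f"Unknown environment ID {env_id!r}.")
    return _REGISTRY[env_id]()


# Two versions of the same toy environment can coexist side by side.
register("Toy-v0", lambda: {"grid_size": 6})
register("Toy-v1", lambda: {"grid_size": 12})
print(parse_env_id("Toy-v1"), make("Toy-v1"))
```

Because each version keeps its own ID, results reported against `Toy-v0` stay reproducible after `Toy-v1` changes the configuration.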

Training 🏎️

To showcase how to train RL agents on Jumanji environments, we provide a random agent and a vanilla actor-critic (A2C) agent. These agents can be found in jumanji/training/.
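For intuition, a random agent can be as simple as sampling uniformly among the currently valid actions. The sketch below assumes the environment exposes a boolean mask of valid actions; the `action_mask` array and the `sample_random_action` helper are hypothetical and do not reproduce the code in jumanji/training/.

```python
import jax
import jax.numpy as jnp


def sample_random_action(key, action_mask):
    """Sample uniformly among the actions where `action_mask` is True."""
    # Valid actions get logit 0 and invalid ones -inf, so invalid actions
    # have zero probability under the categorical distribution.
    logits = jnp.where(action_mask, 0.0, -jnp.inf)
    return jax.random.categorical(key, logits)


key = jax.random.PRNGKey(0)
action_mask = jnp.array([False, True, False, True])  # hypothetical mask

# Draw 100 independent actions against the same mask.
keys = jax.random.split(key, 100)
actions = jax.vmap(sample_random_action, in_axes=(0, None))(keys, action_mask)
print(actions[:10])
```

Such a policy is useful as a sanity-check baseline before training a learned agent.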

Because the environment framework in Jumanji is so flexible, it allows pretty much any problem to be implemented as a Jumanji environment, giving rise to very diverse observations. For this reason, environment-specific networks are required to capture the symmetries of each environment. Alongside the A2C agent implementation, we provide examples of such environment-specific actor-critic networks in jumanji/training/networks.

⚠️ The example agents in jumanji/training are only meant to serve as inspiration for how one can implement an agent. Jumanji is first and foremost a library of environments - as such, the agents and networks will not be maintained to a production standard.

For more information on how to use the example agents, see the training guide.

Contributing 🤝

Contributions are welcome! See our issue tracker for good first issues. Please read our contributing guidelines for details on how to submit pull requests, our Contributor License Agreement, and community guidelines.

Citing Jumanji ✏️

If you use Jumanji in your work, please cite the library using:

@misc{bonnet2024jumanji,
    title={Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX},
    author={Clément Bonnet and Daniel Luo and Donal Byrne and Shikha Surana and Sasha Abramowitz and Paul Duckworth and Vincent Coyette and Laurence I. Midgley and Elshadai Tegegn and Tristan Kalloniatis and Omayma Mahjoub and Matthew Macfarlane and Andries P. Smit and Nathan Grinsztajn and Raphael Boige and Cemlyn N. Waters and Mohamed A. Mimouni and Ulrich A. Mbou Sob and Ruan de Kock and Siddarth Singh and Daniel Furelos-Blanco and Victor Le and Arnu Pretorius and Alexandre Laterre},
    year={2024},
    eprint={2306.09884},
    url={https://arxiv.org/abs/2306.09884},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Acknowledgements 🙏

The development of this library was supported with Cloud TPUs from Google's TPU Research Cloud (TRC) 🌤.

Owner

Name: InstaDeep Ltd
Login: instadeepai
Kind: organization
Email: hello@instadeep.com
Location: London, UK

Website: https://instadeep.com
Twitter: instadeepai
Repositories: 14
Profile: https://github.com/instadeepai

We productise innovation

GitHub Events

Total

Create event: 13
Release event: 1
Issues event: 37
Watch event: 146
Delete event: 10
Issue comment event: 172
Push event: 42
Pull request event: 42
Pull request review comment event: 144
Pull request review event: 141
Fork event: 22

Last Year

Create event: 13
Release event: 1
Issues event: 37
Watch event: 146
Delete event: 10
Issue comment event: 172
Push event: 42
Pull request event: 42
Pull request review comment event: 144
Pull request review event: 141
Fork event: 22

Committers

Last synced: over 1 year ago

All Time

Total Commits: 135
Total Committers: 37
Avg Commits per committer: 3.649
Development Distribution Score (DDS): 0.652

Past Year

Commits: 31
Committers: 11
Avg Commits per committer: 2.818
Development Distribution Score (DDS): 0.613

Top Committers

Name	Email	Commits
Clément Bonnet	5****t	47
Sasha Abramowitz	r**a@g**m	16
Daniel	5****6	13
Alex Laterre	a****e	8
surana01	1****1	4
aar65537	1****7	4
George Ogden	3****n	3
Vincent Coyette	9****v	3
Wiem Khlifi	w**i@i**m	3
zombie-einstein	1****n	2
Tristan Kalloniatis	t**s@g**m	2
Raphaël Boige	4****b	2
Elshadai Tegegn	5****K	2
Callum Tilbury	3****y	2
Arnu Pretorius	a**s@g**m	2
Cemlyn	4****7	1
Cyprien	c**c@g**m	1
Daniel Palenicek	d****n	1
David Tao	r**o@g**m	1
siddarthsingh1	9****1	1
mvmacfarlane	7****e	1
helpingstar	i**r@g**m	1
dantp	1****i	1
Ulrich A. Mbou Sob	m**l@g**m	1
Thomas Hirtz	h**t@g**m	1
S Ashwin	a**1@g**m	1
RuanJohn	3****n	1
Rodrigue Siry	r**y@g**m	1
Raphael Avalos	r**l@a**r	1
RJ Wang	1****j	1
and 7 more...

Committer Domains (Top 20 + Academic)

instadeep.com: 2
avalos.fr: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 104
Total pull requests: 147
Average time to close issues: 4 months
Average time to close pull requests: about 1 month
Total issue authors: 39
Total pull request authors: 39
Average comments per issue: 1.2
Average comments per pull request: 2.2
Merged pull requests: 118
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 16
Pull requests: 25
Average time to close issues: about 1 month
Average time to close pull requests: 19 days
Issue authors: 9
Pull request authors: 6
Average comments per issue: 2.75
Average comments per pull request: 7.84
Merged pull requests: 19
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

dluo96 (26)
clement-bonnet (17)
zombie-einstein (5)
sash-a (4)
George-Ogden (3)
coyettev (3)
cemlyn007 (3)
thomashirtz (2)
carlosgmartin (2)
Wendyuf (2)
RuanJohn (2)
aar65537 (2)
arnupretorius (2)
Egiob (1)
sotetsuk (1)

Pull Request Authors

clement-bonnet (52)
sash-a (29)
zombie-einstein (12)
dluo96 (12)
aar65537 (6)
George-Ogden (6)
WiemKhlifi (6)
callumtilbury (6)
coyettev (5)
RuanJohn (3)
chouakifares (2)
raphaelavalos (2)
thomashirtz (2)
taodav (2)
surana01 (2)

Top Labels

Issue Labels

enhancement (52) bug (27) good first issue (14) documentation (13) ci (4) help wanted (4) duplicate (1) question (1)

Pull Request Labels

enhancement (20) bug (19) documentation (12) ci (12) good first issue (1)

Packages

Total packages: 1
Total downloads: unknown

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 18

proxy.golang.org: github.com/instadeepai/jumanji

Documentation: https://pkg.go.dev/github.com/instadeepai/jumanji#section-documentation
License: apache-2.0
Latest release: v1.1.1
published about 1 year ago

Versions: 18
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 6.5%

Average: 6.7%

Dependent repos count: 7.0%

Last synced: 10 months ago

Dependencies

requirements/requirements-dev.txt pypi

black ==22.3.0 development
coverage * development
dm-haiku ==0.0.5 development
flake8 ==4.0.1 development
hydra-core * development
isort ==5.10.1 development
livereload * development
mkdocs ==1.2.3 development
mkdocs-git-revision-date-plugin * development
mkdocs-include-markdown-plugin * development
mkdocs-material * development
mkdocs-mermaid2-plugin ==0.6.0 development
mkdocstrings ==0.18.0 development
mknotebooks ==0.7.1 development
mypy ==0.942 development
nbmake * development
optax >=0.0.9 development
pre-commit ==2.17.0 development
promise * development
pymdown-extensions * development
pytest ==7.0.1 development
pytest-cov * development
pytest-mock * development
pytest-parallel * development
pytest-xdist * development
pytype * development
testfixtures * development

requirements/requirements.txt pypi

Pillow >=9.0.0
brax >=0.0.10
chex >=0.1.3
dm-env >=1.5
gym >=0.19.0
jax >=0.2.26
jaxlib >=0.1.74
matplotlib >=3.3.4
numpy >=1.19.5
pygame >=2.0.0
typing-extensions >=4.0.0

.github/workflows/docs_deploy.yml actions

JamesIves/github-pages-deploy-action v4 composite
actions/checkout v3 composite

.github/workflows/release.yml actions

actions/checkout v2 composite
actions/setup-python v1 composite

.github/workflows/tests_linters.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite

pyproject.toml pypi

requirements/requirements-train.txt pypi

dm-haiku ==0.0.9
huggingface-hub *
hydra-core ==1.3
neptune-client ==0.16.15
optax >=0.1.4
rlax >=0.1.4
tensorboardX ==2.5.1
tqdm >=4.64.1

setup.py pypi
