https://github.com/alignmentresearch/pgx

A collection of highly-parallel RL game environments written in JAX

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: AlignmentResearch
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 59.2 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of sotetsuk/pgx
Created about 3 years ago · Last pushed about 3 years ago

https://github.com/AlignmentResearch/pgx/blob/main/

[![ci](https://github.com/sotetsuk/pgx/actions/workflows/ci.yml/badge.svg)](https://github.com/sotetsuk/pgx/actions/workflows/ci.yml)

A collection of GPU/TPU-accelerated parallel game simulators for reinforcement learning (RL)

## Why Pgx?

[Brax](https://github.com/google/brax), a [JAX](https://github.com/google/jax)-native physics engine, provides extremely high-speed parallel simulation for RL in *continuous* state spaces. Then, what about RL in *discrete* state spaces like Chess, Shogi, and Go? **Pgx** provides a wide variety of JAX-native game simulators! Highlighted features include:

- **JAX-native.** All `step` functions are *JIT-able*
- **Super fast** in parallel execution on accelerators
- **Various game support** including **Backgammon**, **Chess**, **Shogi**, and **Go**
- **Beautiful visualization** in SVG format

## Install

```sh
pip install pgx
```

## Usage

```py
import jax
import pgx

env = pgx.make("go_19x19")
init = jax.jit(jax.vmap(env.init))  # vectorize and JIT-compile
step = jax.jit(jax.vmap(env.step))

batch_size = 1024
keys = jax.random.split(jax.random.PRNGKey(42), batch_size)
state = init(keys)  # vectorized states
while not state.terminated.all():
    # `model` is a user-supplied policy (not defined here) returning a batch of actions
    action = model(state.current_player, state.observation, state.legal_action_mask)
    state = step(state, action)  # per-game state.reward has shape (2,)
```

## Supported games and road map

> :warning: Pgx is currently in beta, so the API is subject to change without notice. We aim to release v1.0.0 in April 2023. Opinions and comments are more than welcome!

Use `pgx.available_games()` to see the list of currently available games.

| Game | Environment | Visualization |
|:---|:---:|:---:|
| 2048 | :white_check_mark: | :white_check_mark: |
| Animal Shogi | :white_check_mark: | :white_check_mark: |
| Backgammon | :white_check_mark: | :white_check_mark: |
| Bridge Bidding | :construction: | :white_check_mark: |
| Chess | :white_check_mark: | :white_check_mark: |
| Connect Four | :white_check_mark: | :white_check_mark: |
| Go | :white_check_mark: | :white_check_mark: |
| Hex | :white_check_mark: | :white_check_mark: |
| Kuhn Poker | :white_check_mark: | :white_check_mark: |
| Leduc hold'em | :white_check_mark: | :white_check_mark: |
| Mahjong | :construction: | :construction: |
| MinAtar/Asterix | :white_check_mark: | :white_check_mark: |
| MinAtar/Breakout | :white_check_mark: | :white_check_mark: |
| MinAtar/Freeway | :white_check_mark: | :white_check_mark: |
| MinAtar/Seaquest | :white_check_mark: | :white_check_mark: |
| MinAtar/SpaceInvaders | :white_check_mark: | :white_check_mark: |
| Othello | :white_check_mark: | :white_check_mark: |
| Shogi | :white_check_mark: | :white_check_mark: |
| Sparrow Mahjong | :white_check_mark: | :white_check_mark: |
| Tic-tac-toe | :white_check_mark: | :white_check_mark: |
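
The usage loop calls a `model` that the README leaves undefined. As a minimal sketch of one possible stand-in, the snippet below samples a uniformly random legal action for each game in a batch by masking illegal actions before sampling. The helper name `random_legal_action` and the toy mask are hypothetical and not part of Pgx; the mask only assumes the boolean `(batch, num_actions)` layout suggested by `state.legal_action_mask`.

```python
import jax
import jax.numpy as jnp


def random_legal_action(key, legal_action_mask):
    """Sample one legal action per batch row (hypothetical helper, not a Pgx API)."""
    # Legal actions get logit 0 (equal probability); illegal ones get -inf (probability 0).
    logits = jnp.where(legal_action_mask, 0.0, -jnp.inf)
    return jax.random.categorical(key, logits, axis=-1)


# Toy batch of 3 masks over 4 actions, standing in for state.legal_action_mask.
mask = jnp.array([
    [True, False, False, True],
    [False, True, False, False],
    [False, False, True, True],
])
actions = random_legal_action(jax.random.PRNGKey(0), mask)
```

Inside the loop, one would presumably split a fresh key at every step (`key, subkey = jax.random.split(key)`) and pass `state.legal_action_mask` in place of the toy mask, so that each step draws independent samples.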

## See also

Pgx is intended to complement these **JAX-native environments** with (classic) board game suites:

- [RobertTLange/gymnax](https://github.com/RobertTLange/gymnax): JAX implementation of popular RL environments ([classic control](https://gymnasium.farama.org/environments/classic_control), [bsuite](https://github.com/deepmind/bsuite), MinAtar, etc.) and meta RL tasks
- [google/brax](https://github.com/google/brax): Rigid-body physics simulation in JAX and continuous-space RL tasks (ant, fetch, humanoid, etc.)
- [instadeepai/jumanji](https://github.com/instadeepai/jumanji): A suite of diverse and challenging RL environments in JAX (bin-packing, routing problems, etc.)

Combining Pgx with these **JAX-native algorithms/implementations** might be an interesting direction:

- [Anakin framework](https://arxiv.org/abs/2104.06272): Highly efficient RL framework that works with JAX-native environments on TPUs
- [deepmind/mctx](https://github.com/deepmind/mctx): JAX-native MCTS implementations, including AlphaZero and MuZero
- [deepmind/rlax](https://github.com/deepmind/rlax): JAX-native RL components
- [google/evojax](https://github.com/google/evojax): Hardware-accelerated neuroevolution
- [RobertTLange/evosax](https://github.com/RobertTLange/evosax): JAX-native evolution strategy (ES) implementations
- [adaptive-intelligent-robotics/QDax](https://github.com/adaptive-intelligent-robotics/QDax): JAX-native Quality-Diversity (QD) algorithms

## Citation

```
@article{koyamada2023pgx,
  title={Pgx: Hardware-accelerated parallel game simulation for reinforcement learning},
  author={Koyamada, Sotetsu and Okano, Shinri and Nishimori, Soichiro and Murata, Yu and Habara, Keigo and Kita, Haruka and Ishii, Shin},
  journal={arXiv preprint arXiv:2303.17503},
  year={2023}
}
```

## LICENSE

Apache-2.0

Owner

  • Name: FAR AI
  • Login: AlignmentResearch
  • Kind: organization
  • Email: hello@far.ai

FAR AI is an alignment research non-profit working to ensure AI systems are trustworthy and beneficial to society.
