safety-gymnasium

NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

https://github.com/pku-alignment/safety-gymnasium

Keywords

constraint-rl constraint-satisfaction-problem reinforcement-learning safe-policy-optimization safe-reinforcement-learning safe-reinforcement-learning-environments safety-critical safety-critical-systems

Keywords from Contributors

transformers mesh data-profilers datacleaner pipeline-testing exoplanet energy-system cryptocurrencies hydrology spacy-extension

Last synced: 9 months ago

Repository

NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Basic Info

Host: GitHub
Owner: PKU-Alignment
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://safety-gymnasium.readthedocs.io/en/latest/
Size: 490 MB

Statistics

Stars: 478
Watchers: 8
Forks: 66
Open Issues: 10
Releases: 12

Topics

constraint-rl constraint-satisfaction-problem reinforcement-learning safe-policy-optimization safe-reinforcement-learning safe-reinforcement-learning-environments safety-critical safety-critical-systems

Created over 3 years ago · Last pushed over 1 year ago

Metadata Files

Readme Changelog Contributing License Code of conduct Citation

README.md

Safety-Gymnasium

![Python 3.8+](https://img.shields.io/badge/Python-3.8%2B-brightgreen.svg) ![PyPI](https://img.shields.io/pypi/v/safety-gymnasium?logo=pypi) ![Documentation Status](https://img.shields.io/readthedocs/safety-gymnasium?logo=readthedocs) ![Downloads](https://static.pepy.tech/personalized-badge/safety-gymnasium?period=total&left_color=grey&right_color=blue&left_text=downloads) ![GitHub Repo Stars](https://img.shields.io/github/stars/PKU-Alignment/safety-gymnasium?color=brightgreen&logo=github) ![License](https://img.shields.io/github/license/PKU-Alignment/safety-gymnasium?label=license)


Safety-Gymnasium is a highly scalable and customizable Safe Reinforcement Learning (SafeRL) library. It aims to provide a clear basis for benchmarking SafeRL algorithms, together with a standardized set of environments. It offers a set of standard APIs that also carry constraint information, so users can explore new ideas with a clean code framework and well-designed environments.

Citing Safety-Gymnasium

If you find Safety-Gymnasium useful, please cite it in your publications.

    @inproceedings{ji2023safety,
      title={Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark},
      author={Jiaming Ji and Borong Zhang and Jiayi Zhou and Xuehai Pan and Weidong Huang and Ruiyang Sun and Yiran Geng and Yifan Zhong and Josef Dai and Yaodong Yang},
      booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
      year={2023},
      url={https://openreview.net/forum?id=WZmlxIuIGR}
    }

Note for v1.1.0 and v1.2.0

We have updated the environments for both the Safe Vision series and the Safe Isaac Gym series. However, due to package size constraints, we have not yet uploaded versions v1.1.0 and v1.2.0 to PyPI. As a result, users need to download and install the package manually. We currently recommend using GitHub's Download ZIP feature to obtain our package and access the latest environments. In the future, we plan to deploy resources separately to a cloud service to accommodate PyPI. Stay tuned for further updates.

Python 3.11 is not supported for now because of an incompatibility with pygame.

    conda create -n example python=3.8
    conda activate example
    wget https://github.com/PKU-Alignment/safety-gymnasium/archive/refs/heads/main.zip
    unzip main.zip
    cd safety-gymnasium-main
    pip install -e .
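Before installing, it can help to check the interpreter version. The snippet below is a minimal sketch: the note above rules out Python 3.11 and the install commands use 3.8, but treating 3.8 through 3.10 as the full supported range is an assumption, and `is_supported_python` is a hypothetical helper, not part of the package.

```python
import sys


def is_supported_python(version_info=None):
    """Return True for Python 3.8 through 3.10 (assumed supported range)."""
    major, minor = tuple(version_info or sys.version_info)[:2]
    # Lower bound: the install commands use python=3.8.
    # Upper bound: 3.11 is stated as unsupported; newer versions are
    # assumed unsupported as well.
    return (3, 8) <= (major, minor) < (3, 11)


print(is_supported_python())
```

If the check prints `False`, creating a dedicated conda environment with `python=3.8`, as in the commands above, is the documented route.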

Why Safety-Gymnasium?

Here we provide a table comparing Safety-Gymnasium with existing SafeRL environment libraries.

(1): Maintenance (expect bug fixes and minor updates); the last commit is 19 Nov 2021. Safety-Gym depends on mujoco-py 2.0.2.7, which was updated on Oct 12, 2019.

(2): There is no official library for speed-related environments, and its associated cost constraints are constructed from info. But the task is widely used in the study of SafeRL, and we encapsulate it in Safety-Gymnasium.

(3): In the gym 0.26.0 release update, a new API of interaction was redefined.

Environments

We designed a variety of safety-enhanced learning tasks and integrated the contributions from the RL community: safety-velocity, safety-run, safety-circle, safety-goal, safety-button, etc. We introduce a unified safety-enhanced learning benchmark environment library called Safety-Gymnasium.

Further, to facilitate the progress of community research, we redesigned Safety-Gym and removed its dependency on mujoco-py. We built it on top of MuJoCo and fixed some bugs; for the specific bugs, refer to Safety-Gym's BUG Report.

Here is a list of all the environments we support for now:

| Category | Task | Agent | Example |
| --- | --- | --- | --- |
| Safe Navigation | Button[012] | Point, Car, Doggo, Racecar, Ant | SafetyPointGoal1-v0 |
| | Goal[012] | | |
| | Push[012] | | |
| | Circle[012] | | |
| Safe Velocity | Velocity | HalfCheetah, Hopper, Swimmer, Walker2d, Ant, Humanoid | SafetyAntVelocity-v1 |
| Safe Vision | BuildingButton[012] | Point, Car, Doggo, Racecar, Ant | SafetyFormulaOne1-v0 |
| | BuildingGoal[012] | | |
| | BuildingPush[012] | | |
| | FadingEasy[012] | | |
| | FadingHard[012] | | |
| | Race[012] | | |
| | FormulaOne[012] | | |
| Safe Multi-Agent | MultiGoal[012] | Multi-Point, Multi-Ant | SafetyAntMultiGoal1-v0 |
| | Multi-Agent Velocity | 6x1HalfCheetah, 2x3HalfCheetah, 3x1Hopper, 2x1Swimmer, 2x3Walker2d, 2x4Ant, 4x2Ant, 9\|8Humanoid | Safety2x4AntVelocity-v0 |
| | FreightFrankaCloseDrawer(Multi-Agent) | FreightFranka | FreightFrankaCloseDrawer(Multi-Agent) |
| | FreightFrankaPickAndPlace(Multi-Agent) | FreightFranka | FreightFrankaCloseDrawer(Multi-Agent) |
| | ShadowHandCatchOver2UnderarmSafeFinger(Multi-Agent) | ShadowHands | ShadowHandCatchOver2UnderarmSafeJoint(Multi-Agent) |
| | ShadowHandCatchOver2UnderarmSafeJoint(Multi-Agent) | | |
| | ShadowHandOverSafeFinger(Multi-Agent) | | |
| | ShadowHandOverSafeJoint(Multi-Agent) | | |
| Safe Isaac Gym | FreightFrankaCloseDrawer | FreightFranka | FreightFrankaCloseDrawer |
| | FreightFrankaPickAndPlace | FreightFranka | FreightFrankaCloseDrawer |
| | ShadowHandCatchOver2UnderarmSafeFinger | ShadowHands | ShadowHandCatchOver2UnderarmSafeJoint |
| | ShadowHandCatchOver2UnderarmSafeJoint | | |
| | ShadowHandOverSafeFinger | | |
| | ShadowHandOverSafeJoint | | |
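The Example column suggests that Safe Navigation environment ids are composed as `Safety{Agent}{Task}{Level}-v0`. The sketch below enumerates ids under that assumption; whether every agent, task, and level combination is actually registered, and at which version, is not confirmed here and should be checked against the documentation. `navigation_env_ids` is a hypothetical helper.

```python
from itertools import product

# Agents, tasks and levels as listed in the Safe Navigation rows of the table.
AGENTS = ["Point", "Car", "Doggo", "Racecar", "Ant"]
TASKS = ["Goal", "Button", "Push", "Circle"]
LEVELS = [0, 1, 2]


def navigation_env_ids(version="v0"):
    """Build candidate ids following the pattern of 'SafetyPointGoal1-v0'."""
    return [
        f"Safety{agent}{task}{level}-{version}"
        for agent, task, level in product(AGENTS, TASKS, LEVELS)
    ]


ids = navigation_env_ids()
print(len(ids))
print(ids[:3])
```

A list like this is convenient for sweeping a benchmark over several environments, with each id passed to `safety_gymnasium.make`.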

Here are some screenshots of the Safe Navigation tasks.

Agents

Point, Car ([front view](https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/images/car_front.jpeg)), Racecar, Doggo, Ant ([front view](https://github.com/PKU-Alignment/safety-gymnasium/raw/HEAD/images/ant_front.jpeg))

Tasks

Goal0	Goal1	Goal2
Button0	Button1	Button2
Push0	Push1	Push2
Circle0	Circle1	Circle2

Vision-based Safe RL

Vision-based SafeRL lacks realistic scenarios. Although the original Safety-Gym could minimally support visual input, the scenarios were too similar. To facilitate the validation of vision-based SafeRL algorithms, we have developed a set of realistic vision-based SafeRL tasks, which are currently being validated on the baseline.

As a preview, here are images of some of the tasks:

Race0	Race1	Race2
FormulaOne0	FormulaOne1	FormulaOne2

Environment Usage

Note: Safety-Gymnasium expresses the cost explicitly on top of the Gymnasium API. The `step` method returns six items, `(next_observation, reward, cost, terminated, truncated, info)`, which adds a `cost` field to the standard Gymnasium return value.

```python
import safety_gymnasium

env_id = 'SafetyPointGoal1-v0'
env = safety_gymnasium.make(env_id)

obs, info = env.reset()
while True:
    # sample a random action and step the environment
    act = env.action_space.sample()
    obs, reward, cost, terminated, truncated, info = env.step(act)
    if terminated or truncated:
        break
    env.render()
```
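Because `step` returns the cost separately, reward and cost can be accumulated side by side. The following is a minimal sketch of that bookkeeping. `ToyCostEnv` is a hypothetical stand-in, not part of Safety-Gymnasium; it only mimics the `reset` and `step` signatures described above so the loop runs without MuJoCo installed.

```python
class ToyCostEnv:
    """Hypothetical stand-in that mimics the six-item step signature."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        obs = float(self.t)
        reward = 1.0
        # Made-up rule: a cost of 1.0 whenever the action exceeds 0.5.
        cost = 1.0 if action > 0.5 else 0.0
        terminated = False
        truncated = self.t >= self.horizon
        return obs, reward, cost, terminated, truncated, {}


def rollout(env, policy):
    """Run one episode and return (episode_return, episode_cost, steps)."""
    obs, info = env.reset()
    ep_ret, ep_cost, steps = 0.0, 0.0, 0
    while True:
        obs, reward, cost, terminated, truncated, info = env.step(policy(obs))
        ep_ret += reward
        ep_cost += cost
        steps += 1
        if terminated or truncated:
            return ep_ret, ep_cost, steps


# Toy policy: act "unsafely" (1.0) on even observations, safely otherwise.
ep_ret, ep_cost, steps = rollout(ToyCostEnv(), lambda obs: 1.0 if obs % 2 == 0 else 0.0)
print(ep_ret, ep_cost, steps)
```

With the real library, the same `rollout` function should work on an environment created by `safety_gymnasium.make('SafetyPointGoal1-v0')`, with the policy returning actions from `env.action_space`.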

We also provide two convenience wrappers for converting the Safety-Gymnasium environment to the standard Gymnasium API and vice versa.

```python
import safety_gymnasium

# Safety-Gymnasium API: step returns (next_observation, reward, cost, terminated, truncated, info)
# Gymnasium API: step returns (next_observation, reward, terminated, truncated, info)
# and cost is in the `info` dict associated with a str key `'cost'`

env_id = 'SafetyPointGoal1-v0'
safety_gymnasium_env = safety_gymnasium.make(env_id)
gymnasium_env = safety_gymnasium.wrappers.SafetyGymnasium2Gymnasium(safety_gymnasium_env)

safety_gymnasium_env = safety_gymnasium.wrappers.Gymnasium2SafetyGymnasium(gymnasium_env)
```
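The conversion the two wrappers perform can be pictured on a single step result. The sketch below is illustrative only and is not the library's implementation: `to_gymnasium_step` and `to_safety_step` are hypothetical helpers, and the choice to default to a cost of `0.0` when `'cost'` is missing is an assumption.

```python
def to_gymnasium_step(safe_step):
    """Fold the cost of a six-item step result into info['cost']."""
    obs, reward, cost, terminated, truncated, info = safe_step
    info = dict(info)  # copy so the caller's dict is not mutated
    info["cost"] = cost
    return obs, reward, terminated, truncated, info


def to_safety_step(gym_step, default_cost=0.0):
    """Lift info['cost'] of a five-item step result into its own field."""
    obs, reward, terminated, truncated, info = gym_step
    cost = info.get("cost", default_cost)  # assumed default when absent
    return obs, reward, cost, terminated, truncated, info


safe = ("obs", 1.0, 0.5, False, False, {"k": 1})
gym = to_gymnasium_step(safe)
back = to_safety_step(gym)
print(len(gym), gym[4]["cost"], len(back), back[2])
```

Keeping the cost in `info` is what lets standard Gymnasium wrappers, which expect a five-item result, operate on a safety environment.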

Users can apply Gymnasium wrappers easily with:

```python
import gymnasium
import safety_gymnasium

def make_safe_env(env_id):
    safe_env = safety_gymnasium.make(env_id)
    # convert to the standard Gymnasium API so Gymnasium wrappers can be applied
    env = safety_gymnasium.wrappers.SafetyGymnasium2Gymnasium(safe_env)
    env = gymnasium.wrappers.SomeWrapper1(env)
    env = gymnasium.wrappers.SomeWrapper2(env, argname1=arg1, argname2=arg2)
    ...
    env = gymnasium.wrappers.SomeWrapperN(env)
    # convert back to the Safety-Gymnasium API
    safe_env = safety_gymnasium.wrappers.Gymnasium2SafetyGymnasium(env)
    return safe_env
```

or

```python
import functools

import gymnasium
import safety_gymnasium

def make_safe_env(env_id):
    return safety_gymnasium.wrappers.with_gymnasium_wrappers(
        safety_gymnasium.make(env_id),
        gymnasium.wrappers.SomeWrapper1,
        functools.partial(gymnasium.wrappers.SomeWrapper2, argname1=arg1, argname2=arg2),
        ...,
        gymnasium.wrappers.SomeWrapperN,
    )
```
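The second form passes wrapper classes, or `functools.partial` objects for wrappers that need arguments, instead of already-wrapped environments. The sketch below illustrates only that composition pattern with hypothetical names (`apply_wrappers`, `Tag`); it is not the implementation of the library helper, which also has to convert between the two APIs.

```python
import functools


def apply_wrappers(env, *wrappers):
    """Apply wrapper factories left to right, so the last one is outermost."""
    for wrap in wrappers:
        env = wrap(env)
    return env


class Tag:
    """Hypothetical wrapper that records a label around the wrapped object."""

    def __init__(self, env, label="tag"):
        self.env = env
        self.label = label

    def chain(self):
        # Walk inward to list the base object followed by each wrapper label.
        inner = self.env.chain() if isinstance(self.env, Tag) else [self.env]
        return inner + [self.label]


wrapped = apply_wrappers(
    "base-env",
    Tag,                                     # wrapper with default arguments
    functools.partial(Tag, label="scaled"),  # wrapper with bound arguments
)
print(wrapped.chain())
```

`functools.partial` binds the keyword arguments up front, so every entry in the list can be called with the environment as its only argument.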

In addition, for every Safety-Gymnasium environment, we also provide a corresponding Gymnasium environment whose environment id carries the suffix `Gymnasium`. For example:

```python
import gymnasium
import safety_gymnasium

safety_gymnasium.make('SafetyPointGoal1-v0')  # step returns (next_observation, reward, cost, terminated, truncated, info)
gymnasium.make('SafetyPointGoal1Gymnasium-v0')  # step returns (next_observation, reward, terminated, truncated, info)
```
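Judging from the example ids, the Gymnasium variant is named by inserting `Gymnasium` before the version tag. A small sketch of that mapping follows; `gymnasium_variant_id` is a hypothetical helper, and the rule is inferred from the single example above rather than from the library's registration code.

```python
def gymnasium_variant_id(env_id):
    """Map 'SafetyPointGoal1-v0' to 'SafetyPointGoal1Gymnasium-v0'."""
    name, sep, version = env_id.rpartition("-")
    if not sep:
        raise ValueError(f"expected an id of the form '<name>-v<N>', got {env_id!r}")
    return f"{name}Gymnasium-{version}"


print(gymnasium_variant_id("SafetyPointGoal1-v0"))
```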

Installation

Install from PyPI

    pip install safety-gymnasium

Install from source

```bash
conda create -n <envname> python=3.8
conda activate <envname>

git clone https://github.com/PKU-Alignment/safety-gymnasium.git
cd safety-gymnasium
pip install -e .
```

Important Notes

If rendering fails on your server, you can try:

    echo "export MUJOCO_GL=osmesa" >> ~/.bashrc
    source ~/.bashrc
    apt-get install libosmesa6-dev
    apt-get install python3-opengl
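The same rendering backend can also be selected per script instead of in `~/.bashrc`. This is a minimal sketch, assuming the `MUJOCO_GL` variable is read when MuJoCo is first imported, so it has to be set before `safety_gymnasium` is imported. The OSMesa system packages listed above are still required.

```python
import os

# Select the software (OSMesa) rendering backend for this process only.
# This must happen before the first import of safety_gymnasium or mujoco.
os.environ["MUJOCO_GL"] = "osmesa"

backend = os.environ["MUJOCO_GL"]
print(backend)
```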

Debug with your keyboard

For simple agents, we offer the capability to control the robot's movement via the keyboard, facilitating debugging. Simply append a Debug suffix to the task name, such as SafetyCarGoal2Debug-v0, and utilize the keys I, K, J, and L to guide the robot's movement.

For more intricate agents, you can also craft custom control logic based on specific peripherals. To achieve this, implement the debug method from the BaseAgent for the designated agent.

Customize your environments

We provide a highly extensible code framework that is easy to understand, so you can design your own environments for your research in no more than 100 lines of code on average.

For details, please refer to our documentation. Here is a minimal example:

```python
# import the objects you want to use
# or you can define specific objects by yourself, just make sure they obey our specification
from safety_gymnasium.assets.geoms import Apples
from safety_gymnasium.bases import BaseTask


# inherit the basetask
class MytaskLevel0(BaseTask):
    def __init__(self, config):
        super().__init__(config=config)
        # define some properties
        self.num_steps = 500
        self.agent.placements = [(-0.8, -0.8, 0.8, 0.8)]
        self.agent.keepout = 0
        self.lidar_conf.max_dist = 6
        # add objects into environments
        self.add_geoms(Apples(num=2, size=0.3))

    def calculate_reward(self):
        # implement your reward function
        # Note: cost calculation is based on objects, so it's automatic
        reward = 1
        return reward

    def specific_reset(self):
        # depending on your task
        pass

    def specific_step(self):
        # depending on your task
        pass

    def update_world(self):
        # depending on your task
        pass

    @property
    def goal_achieved(self):
        # depending on your task
        pass
```

License

Safety-Gymnasium is released under Apache License 2.0.

Owner

Name: PKU-Alignment
Login: PKU-Alignment
Kind: organization
Email: yaodong.yang@outlook.com
Location: China

Repositories: 3
Profile: https://github.com/PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.

GitHub Events

Total

Issues event: 5
Watch event: 92
Issue comment event: 10
Pull request review event: 1
Pull request event: 4
Fork event: 14
Create event: 1

Last Year

Issues event: 5
Watch event: 92
Issue comment event: 10
Pull request review event: 1
Pull request event: 4
Fork event: 14
Create event: 1

Committers

Last synced: over 2 years ago

All Time

Total Commits: 65
Total Committers: 9
Avg Commits per committer: 7.222
Development Distribution Score (DDS): 0.431

Past Year

Commits: 65
Committers: 9
Avg Commits per committer: 7.222
Development Distribution Score (DDS): 0.431

Top Committers

Name	Email	Commits
muchvo	1****o	37
Xuehai Pan	X**n@p**n	9
zmsn-2077	7****7	9
Jiayi Zhou	1****j	4
WeidongHuang	4****g	2
Ruiyang Sun	r**2@g**m	1
pre-commit-ci[bot]	6****]	1
dependabot[bot]	4****]	1
muchvo	m**q@y**t	1
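The summary figures above can be reproduced from the commit counts in the table. The sketch assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page. The two `muchvo` rows are treated as separate committers, as the table lists them.

```python
# Commit counts copied from the Top Committers table, in order.
commits = [37, 9, 9, 4, 2, 1, 1, 1, 1]

total = sum(commits)
avg_per_committer = total / len(commits)
# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits) / total

print(total, len(commits), round(avg_per_committer, 3), round(dds, 3))
```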

Committer Domains (Top 20 + Academic)

yeah.net: 1 pku.edu.cn: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time

Total issues: 22
Total pull requests: 32
Average time to close issues: 2 months
Average time to close pull requests: 20 days
Total issue authors: 21
Total pull request authors: 9
Average comments per issue: 1.95
Average comments per pull request: 0.25
Merged pull requests: 28
Bot issues: 0
Bot pull requests: 3

Past Year

Issues: 9
Pull requests: 5
Average time to close issues: N/A
Average time to close pull requests: about 14 hours
Issue authors: 9
Pull request authors: 3
Average comments per issue: 0.78
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 1


Top Authors

Issue Authors

aivarsoo (2)
s8phsaue (1)
hmnhonari (1)
BeFranke (1)
zhaoxuesi (1)
team-daniel (1)
ZhHe11 (1)
pulak-gautam (1)
FrankSinatral (1)
fardinabbasi (1)
ameesh-shah (1)
kkkx0 (1)
SimonZhan-code (1)
jamesarambam (1)
chan-yuu (1)

Pull Request Authors

muchvo (15)
Gaiejj (4)
dependabot[bot] (4)
zmsn-2077 (4)
BeFranke (2)
aivarsoo (2)
Ethyn13 (2)
hdadong (2)
pseudo-rnd-thoughts (2)

Top Labels

Issue Labels

question (16) bug (3) enhancement (2)

Pull Request Labels

dependencies (4) documentation (2) enhancement (1) dependency (1)

Packages

Total packages: 1
Total downloads:
- pypi 718 last-month
Total docker downloads: 53

Total dependent packages: 2
Total dependent repositories: 1
Total versions: 13
Total maintainers: 4

pypi.org: safety-gymnasium

A highly scalable and customizable safe reinforcement learning environment.

Homepage: https://github.com/PKU-Alignment/safety-gymnasium
Documentation: https://www.safety-gymnasium.com
License: Apache License, Version 2.0
Latest release: 1.2.1
published almost 3 years ago

Versions: 13
Dependent Packages: 2
Dependent Repositories: 1
Downloads: 718 Last month
Docker Downloads: 53

Rankings

Dependent packages count: 3.2%

Docker downloads count: 3.4%

Stargazers count: 4.4%

Forks count: 7.4%

Average: 8.4%

Downloads: 10.8%

Dependent repos count: 21.5%

Maintainers (4)

XuehaiPan jiamingV3 BorongZhang gaiejj

Last synced: 10 months ago

safety-gymnasium

Science Score: 36.0%
