safety-gymnasium
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
✓Committers with academic emails
1 of 9 committers (11.1%) from academic institutions -
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (6.0%) to scientific vocabulary
Repository
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Basic Info
- Host: GitHub
- Owner: PKU-Alignment
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://safety-gymnasium.readthedocs.io/en/latest/
- Size: 490 MB
Statistics
- Stars: 478
- Watchers: 8
- Forks: 66
- Open Issues: 10
- Releases: 12
Topics
Metadata Files
README.md
Safety-Gymnasium
Why Safety-Gymnasium? | Documentation | Install guide | Customization | Video
Safety-Gymnasium is a highly scalable and customizable Safe Reinforcement Learning (SafeRL) library. It aims to provide a standardized set of environments and a clear basis for benchmarking SafeRL algorithms. We provide a set of standard APIs that also carry constraint (cost) information. Users can explore new insights via an elegant code framework and well-designed environments.
Citing Safety-Gymnasium
If you find Safety-Gymnasium useful, please cite it in your publications.
```bibtex
@inproceedings{ji2023safety,
  title={Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark},
  author={Jiaming Ji and Borong Zhang and Jiayi Zhou and Xuehai Pan and Weidong Huang and Ruiyang Sun and Yiran Geng and Yifan Zhong and Josef Dai and Yaodong Yang},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2023},
  url={https://openreview.net/forum?id=WZmlxIuIGR}
}
```
Note for v1.1.0 and v1.2.0
We have updated the environments for both the Safe Vision series and the Safe Isaac Gym series. However, due to package size constraints, we have not yet uploaded versions v1.1.0 and v1.2.0 to PyPI, so users must download and install them manually. We currently recommend using GitHub's Download zip feature to obtain our package and access the latest environments. In the future, we plan to host the resources separately on a cloud service so that the package fits on PyPI. Stay tuned for further updates.
Python 3.11 is not supported for now because of an incompatibility with pygame.
```bash
conda create -n example python=3.8
conda activate example
wget https://github.com/PKU-Alignment/safety-gymnasium/archive/refs/heads/main.zip
unzip main.zip
cd safety-gymnasium-main
pip install -e .
```
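Because Python 3.11 is not supported for now, a setup script can check the interpreter version before installing. The following is a minimal sketch: `is_supported_python` is a hypothetical helper (not part of Safety-Gymnasium), it assumes versions newer than 3.11 are affected as well, and it does not check any lower bound.

```python
import sys


def is_supported_python(version_info=None):
    """Return True if the interpreter is older than Python 3.11.

    Hypothetical helper: it only encodes the restriction stated above
    and assumes that releases after 3.11 are unsupported too.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < (3, 11)


# Report the result for the running interpreter.
print("supported:", is_supported_python())
```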
Why Safety-Gymnasium?
The table below compares Safety-Gymnasium with existing SafeRL environment libraries.
| SafeRL Envs | Engine | Vectorized Environments | New Gym API(3) | Vision Input |
| :---: | :---: | :---: | :---: | :---: |
| Safety-Gym | mujoco-py(1) | | | minimally supported |
| safe-control-gym | PyBullet | | | |
| Velocity-Constraints(2) | N/A | | | |
| mujoco-circle | PyTorch | | | |
| Safety-Gymnasium | MuJoCo 2.3.0+ | | | |
(1): In maintenance mode (expect only bug fixes and minor updates); the last commit was on 19 Nov 2021. Safety-Gym depends on mujoco-py 2.0.2.7, which was updated on Oct 12, 2019.
(2): There is no official library for velocity-constrained environments; their cost constraints are constructed from info. Because the task is widely used in SafeRL research, we encapsulate it in Safety-Gymnasium.
(3): The gym 0.26.0 release redefined the environment interaction API.
Environments
We designed a variety of safety-enhanced learning tasks and integrated contributions from the RL community: safety-velocity, safety-run, safety-circle, safety-goal, safety-button, etc.
Together they form a unified safety-enhanced learning benchmark environment library called Safety-Gymnasium.
Further, to facilitate community research, we redesigned Safety-Gym and removed its dependency on mujoco-py.
We built it on top of MuJoCo and fixed some bugs; for specifics, refer to Safety-Gym's BUG Report.
Here is a list of all the environments we support for now:
| Category | Task | Agent | Example |
|---|---|---|---|
| Safe Navigation | Button[012] | Point, Car, Doggo, Racecar, Ant | SafetyPointGoal1-v0 |
| | Goal[012] | | |
| | Push[012] | | |
| | Circle[012] | | |
| Safe Velocity | Velocity | HalfCheetah, Hopper, Swimmer, Walker2d, Ant, Humanoid | SafetyAntVelocity-v1 |
| Safe Vision | BuildingButton[012] | Point, Car, Doggo, Racecar, Ant | SafetyFormulaOne1-v0 |
| | BuildingGoal[012] | | |
| | BuildingPush[012] | | |
| | FadingEasy[012] | | |
| | FadingHard[012] | | |
| | Race[012] | | |
| | FormulaOne[012] | | |
| Safe Multi-Agent | MultiGoal[012] | Multi-Point, Multi-Ant | SafetyAntMultiGoal1-v0 |
| | Multi-Agent Velocity | 6x1HalfCheetah, 2x3HalfCheetah, 3x1Hopper, 2x1Swimmer, 2x3Walker2d, 2x4Ant, 4x2Ant, 9\|8Humanoid | Safety2x4AntVelocity-v0 |
| | FreightFrankaCloseDrawer(Multi-Agent) | FreightFranka | FreightFrankaCloseDrawer(Multi-Agent) |
| | FreightFrankaPickAndPlace(Multi-Agent) | | |
| | ShadowHandCatchOver2UnderarmSafeFinger(Multi-Agent) | ShadowHands | ShadowHandCatchOver2UnderarmSafeJoint(Multi-Agent) |
| | ShadowHandCatchOver2UnderarmSafeJoint(Multi-Agent) | | |
| | ShadowHandOverSafeFinger(Multi-Agent) | | |
| | ShadowHandOverSafeJoint(Multi-Agent) | | |
| Safe Isaac Gym | FreightFrankaCloseDrawer | FreightFranka | FreightFrankaCloseDrawer |
| | FreightFrankaPickAndPlace | | |
| | ShadowHandCatchOver2UnderarmSafeFinger | ShadowHands | ShadowHandCatchOver2UnderarmSafeJoint |
| | ShadowHandCatchOver2UnderarmSafeJoint | | |
| | ShadowHandOverSafeFinger | | |
| | ShadowHandOverSafeJoint | | |
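The example id `SafetyPointGoal1-v0` suggests a naming pattern of `Safety` + agent + task + level + version. The sketch below enumerates candidate Safe Navigation ids under that assumption; `navigation_env_ids` is a hypothetical helper, and whether every combination is actually registered should be checked against the documentation.

```python
from itertools import product

# Values copied from the Safe Navigation rows of the table above.
AGENTS = ["Point", "Car", "Doggo", "Racecar", "Ant"]
TASKS = ["Button", "Goal", "Push", "Circle"]
LEVELS = [0, 1, 2]


def navigation_env_ids(version="v0"):
    """Build candidate ids that follow the pattern of 'SafetyPointGoal1-v0'."""
    return [
        f"Safety{agent}{task}{level}-{version}"
        for agent, task, level in product(AGENTS, TASKS, LEVELS)
    ]


ids = navigation_env_ids()
print(len(ids))  # 5 agents * 4 tasks * 3 levels = 60
print(ids[:3])
```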
Here are some screenshots of the Safe Navigation tasks.
Agents
Tasks
Vision-based Safe RL
Vision-based SafeRL lacks realistic scenarios.
Although the original Safety-Gym minimally supported visual input, its scenarios were too similar to one another.
To facilitate the validation of vision-based SafeRL algorithms, we have developed a set of realistic vision-based SafeRL tasks, which are currently being validated on the baseline.
As an appetizer, here are some images:
Environment Usage
Notes: We support explicitly expressing the cost based on Gymnasium APIs.
The step method returns 6 items (next_observation, reward, cost, terminated, truncated, info), with cost as the extra field.
```python
import safety_gymnasium

env_id = 'SafetyPointGoal1-v0'
env = safety_gymnasium.make(env_id)

obs, info = env.reset()
while True:
    # sample a random action and step the environment
    act = env.action_space.sample()
    obs, reward, cost, terminated, truncated, info = env.step(act)
    if terminated or truncated:
        break
    env.render()
```
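SafeRL evaluation typically tracks the accumulated cost of an episode alongside its return. The sketch below shows one way to do this with the 6-item step signature. `ToyCostEnv` and `run_episode` are hypothetical names, and the toy environment is only a stand-in so the snippet runs without MuJoCo; with the real library you would pass an environment created by `safety_gymnasium.make` and a policy such as `lambda obs: env.action_space.sample()`.

```python
class ToyCostEnv:
    """Hypothetical stand-in that mimics the 6-item step signature."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        reward = 1.0
        # Toy rule: large actions are "unsafe" and incur a unit cost.
        cost = 1.0 if abs(action) > 0.5 else 0.0
        terminated = False
        truncated = self.t >= self.horizon
        return float(self.t), reward, cost, terminated, truncated, {}


def run_episode(env, policy):
    """Roll out one episode and return (episode_return, episode_cost, steps)."""
    obs, info = env.reset()
    ep_ret, ep_cost, steps = 0.0, 0.0, 0
    while True:
        obs, reward, cost, terminated, truncated, info = env.step(policy(obs))
        ep_ret += reward
        ep_cost += cost
        steps += 1
        if terminated or truncated:
            return ep_ret, ep_cost, steps


ep_ret, ep_cost, steps = run_episode(ToyCostEnv(horizon=5), policy=lambda obs: 1.0)
print(ep_ret, ep_cost, steps)  # 5.0 5.0 5
```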
We also provide two convenience wrappers for converting the Safety-Gymnasium environment to the standard Gymnasium API and vice versa.
```python
# Safety-Gymnasium API: step returns (next_observation, reward, cost, terminated, truncated, info)
# Gymnasium API:        step returns (next_observation, reward, terminated, truncated, info) and cost is in the info dict associated with a str key 'cost'

safety_gymnasium_env = safety_gymnasium.make(env_id)
gymnasium_env = safety_gymnasium.wrappers.SafetyGymnasium2Gymnasium(safety_gymnasium_env)

safety_gymnasium_env = safety_gymnasium.wrappers.Gymnasium2SafetyGymnasium(gymnasium_env)
```
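To make the difference between the two step signatures concrete, the following sketch converts a single step result in each direction. It illustrates the tuple layout described above and is not the wrappers' actual implementation; the function names are hypothetical, and keeping `'cost'` in `info` and defaulting a missing cost to `0.0` are assumptions.

```python
def to_gymnasium_step(step_result):
    """6-item Safety-Gymnasium result -> 5-item result with info['cost']."""
    obs, reward, cost, terminated, truncated, info = step_result
    info = dict(info)  # copy so the caller's dict is not mutated
    info["cost"] = cost
    return obs, reward, terminated, truncated, info


def to_safety_gymnasium_step(step_result):
    """5-item Gymnasium result -> 6-item result with an explicit cost."""
    obs, reward, terminated, truncated, info = step_result
    info = dict(info)
    cost = info.get("cost", 0.0)  # assumption: a missing cost counts as 0.0
    return obs, reward, cost, terminated, truncated, info


safe_step = ("obs", 1.0, 0.5, False, True, {"speed": 2.0})
gym_step = to_gymnasium_step(safe_step)
print(gym_step)
print(to_safety_gymnasium_step(gym_step))
```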
Users can apply Gymnasium wrappers easily with:
```python
import gymnasium
import safety_gymnasium

# SomeWrapper1..N, arg1 and arg2 are placeholders for the wrappers and arguments you need
def make_safe_env(env_id):
    safe_env = safety_gymnasium.make(env_id)
    env = safety_gymnasium.wrappers.SafetyGymnasium2Gymnasium(safe_env)
    env = gymnasium.wrappers.SomeWrapper1(env)
    env = gymnasium.wrappers.SomeWrapper2(env, argname1=arg1, argname2=arg2)
    ...
    env = gymnasium.wrappers.SomeWrapperN(env)
    safe_env = safety_gymnasium.wrappers.Gymnasium2SafetyGymnasium(env)
    return safe_env
```
or
```python
import functools

import gymnasium
import safety_gymnasium

def make_safe_env(env_id):
    return safety_gymnasium.wrappers.with_gymnasium_wrappers(
        safety_gymnasium.make(env_id),
        gymnasium.wrappers.SomeWrapper1,
        functools.partial(gymnasium.wrappers.SomeWrapper2, argname1=arg1, argname2=arg2),
        ...,
        gymnasium.wrappers.SomeWrapperN,
    )
```
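The second form works because `functools.partial` binds the keyword arguments up front, so every entry in the list becomes a callable that takes only the environment. The sketch below shows that composition pattern with stand-in classes. `apply_wrappers`, `Tag` and `describe` are hypothetical names used for illustration, and this is not the library's implementation.

```python
import functools


def apply_wrappers(env, *wrappers):
    """Apply each one-argument wrapper factory in order, innermost first."""
    for wrapper in wrappers:
        env = wrapper(env)
    return env


class Tag:
    """Hypothetical wrapper standing in for a Gymnasium wrapper."""

    def __init__(self, env, label="tag"):
        self.env = env
        self.label = label


def describe(env):
    """List wrapper labels from the outermost wrapper inwards."""
    labels = []
    while isinstance(env, Tag):
        labels.append(env.label)
        env = env.env
    return labels


wrapped = apply_wrappers("base-env", Tag, functools.partial(Tag, label="scaled"))
print(describe(wrapped))  # ['scaled', 'tag']
```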
In addition, for all Safety-Gymnasium environments, we also provide corresponding Gymnasium environments with a suffix Gymnasium in the environment id. For example:
```python
import gymnasium
import safety_gymnasium

safety_gymnasium.make('SafetyPointGoal1-v0')    # step returns (next_observation, reward, cost, terminated, truncated, info)
gymnasium.make('SafetyPointGoal1Gymnasium-v0')  # step returns (next_observation, reward, terminated, truncated, info)
```
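If you keep a list of Safety-Gymnasium ids, the matching Gymnasium id can be derived by inserting the suffix before the version tag. A minimal sketch, assuming every id ends in `-v<N>` as in the example above; `gymnasium_env_id` is a hypothetical helper, not a library function.

```python
def gymnasium_env_id(safety_env_id):
    """Insert the 'Gymnasium' suffix before the version tag of an env id."""
    name, sep, version = safety_env_id.rpartition("-")
    if not sep:
        raise ValueError(f"expected an id like 'Name-v0', got {safety_env_id!r}")
    return f"{name}Gymnasium-{version}"


print(gymnasium_env_id("SafetyPointGoal1-v0"))  # SafetyPointGoal1Gymnasium-v0
```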
Installation
Install from PyPI
```bash
pip install safety-gymnasium
```
Install from source
```bash
conda create -n
git clone https://github.com/PKU-Alignment/safety-gymnasium.git
cd safety-gymnasium
pip install -e .
```
Important Notes
If rendering fails on your server, you can try:
```bash
echo "export MUJOCO_GL=osmesa" >> ~/.bashrc
source ~/.bashrc
apt-get install libosmesa6-dev
apt-get install python3-opengl
```
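When editing `~/.bashrc` is not convenient (for example in a notebook or a batch job), the same variable can be set from Python for the current process only. This is a minimal sketch: `use_software_rendering` is a hypothetical helper, and it assumes the variable is read when the rendering backend is first loaded, so it should be called before importing the simulator.

```python
import os


def use_software_rendering(backend="osmesa"):
    """Set MUJOCO_GL for the current process and return the value in effect."""
    os.environ["MUJOCO_GL"] = backend
    return os.environ["MUJOCO_GL"]


# Call this before importing safety_gymnasium or mujoco.
print(use_software_rendering())  # osmesa
```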
Debug with your keyboard
For simple agents, we offer the capability to control the robot's movement via the keyboard, facilitating debugging. Simply append a Debug suffix to the task name, such as SafetyCarGoal2Debug-v0, and utilize the keys I, K, J, and L to guide the robot's movement.
For more intricate agents, you can also craft custom control logic based on specific peripherals. To achieve this, implement the debug method from the BaseAgent for the designated agent.
Customize your environments
We provide a highly extensible code framework that is easy to understand, so you can design environments for your research in no more than 100 lines of code on average.
For details, please refer to our documentation. Here is a minimal example:
```python
# import the objects you want to use
# or define specific objects yourself; just make sure they obey our specification
from safety_gymnasium.assets.geoms import Apples
from safety_gymnasium.bases import BaseTask

# inherit from the base task
class MytaskLevel0(BaseTask):
    def __init__(self, config):
        super().__init__(config=config)
        # define some properties
        self.num_steps = 500
        self.agent.placements = [(-0.8, -0.8, 0.8, 0.8)]
        self.agent.keepout = 0
        self.lidar_conf.max_dist = 6
        # add objects into environments
        self._add_geoms(Apples(num=2, size=0.3))

    def calculate_reward(self):
        # implement your reward function
        # Note: cost calculation is based on objects, so it's automatic
        reward = 1
        return reward

    def specific_reset(self):
        pass  # depending on your task

    def specific_step(self):
        pass  # depending on your task

    def update_world(self):
        pass  # depending on your task

    @property
    def goal_achieved(self):
        pass  # depending on your task
```
License
Safety-Gymnasium is released under Apache License 2.0.
Owner
- Name: PKU-Alignment
- Login: PKU-Alignment
- Kind: organization
- Email: yaodong.yang@outlook.com
- Location: China
- Repositories: 3
- Profile: https://github.com/PKU-Alignment
Loves Sharing and Open-Source, Making AI Safer.
GitHub Events
Total
- Issues event: 5
- Watch event: 92
- Issue comment event: 10
- Pull request review event: 1
- Pull request event: 4
- Fork event: 14
- Create event: 1
Last Year
- Issues event: 5
- Watch event: 92
- Issue comment event: 10
- Pull request review event: 1
- Pull request event: 4
- Fork event: 14
- Create event: 1
Committers
Last synced: over 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| muchvo | 1****o | 37 |
| Xuehai Pan | X****n@p****n | 9 |
| zmsn-2077 | 7****7 | 9 |
| Jiayi Zhou | 1****j | 4 |
| WeidongHuang | 4****g | 2 |
| Ruiyang Sun | r****2@g****m | 1 |
| pre-commit-ci[bot] | 6****] | 1 |
| dependabot[bot] | 4****] | 1 |
| muchvo | m****q@y****t | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 22
- Total pull requests: 32
- Average time to close issues: 2 months
- Average time to close pull requests: 20 days
- Total issue authors: 21
- Total pull request authors: 9
- Average comments per issue: 1.95
- Average comments per pull request: 0.25
- Merged pull requests: 28
- Bot issues: 0
- Bot pull requests: 3
Past Year
- Issues: 9
- Pull requests: 5
- Average time to close issues: N/A
- Average time to close pull requests: about 14 hours
- Issue authors: 9
- Pull request authors: 3
- Average comments per issue: 0.78
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- aivarsoo (2)
- s8phsaue (1)
- hmnhonari (1)
- BeFranke (1)
- zhaoxuesi (1)
- team-daniel (1)
- ZhHe11 (1)
- pulak-gautam (1)
- FrankSinatral (1)
- fardinabbasi (1)
- ameesh-shah (1)
- kkkx0 (1)
- SimonZhan-code (1)
- jamesarambam (1)
- chan-yuu (1)
Pull Request Authors
- muchvo (15)
- Gaiejj (4)
- dependabot[bot] (4)
- zmsn-2077 (4)
- BeFranke (2)
- aivarsoo (2)
- Ethyn13 (2)
- hdadong (2)
- pseudo-rnd-thoughts (2)
Packages
- Total packages: 1
- Total downloads: pypi 718 last-month
- Total docker downloads: 53
- Total dependent packages: 2
- Total dependent repositories: 1
- Total versions: 13
- Total maintainers: 4
pypi.org: safety-gymnasium
A highly scalable and customizable safe reinforcement learning environment.
- Homepage: https://github.com/PKU-Alignment/safety-gymnasium
- Documentation: https://www.safety-gymnasium.com
- License: Apache License, Version 2.0
- Latest release: 1.2.1 (published almost 3 years ago)
Rankings
Maintainers (4)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- furo *
- moviepy *
- myst-parser *
- pygame *
- sphinx *
- sphinx-autobuild *
- sphinx-design *
- sphinx_github_changelog *
- gymnasium ==0.26.3
- imageio ==2.25.0
- mujoco ==2.3.0
- pygame ==2.1.0
- pyyaml ==6.0
- xmltodict ==0.13.0






















