sigmarl

SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning

Keywords

autonomous-driving autonomous-vehicles connected-and-automated-vehicles connected-vehicle marl motion-planing multi-agent-reinforcement-learning pytorch reinforcement-learning

Last synced: 9 months ago

Repository


Basic Info

Host: GitHub
Owner: bassamlab
License: other
Language: Python
Default Branch: main
Homepage:
Size: 59.7 MB

Statistics

Stars: 58
Watchers: 1
Forks: 1
Open Issues: 1
Releases: 6


Created about 2 years ago · Last pushed 9 months ago


[!NOTE]
- Check out our recent work CBF-Based Safety Filter! It proposes a real-time CBF-based safety filter for safety verification of learning-based motion planning with road boundary constraints (see also Fig. 5).
- Check out our recent work Truncated Taylor CBF! It proposes a new notion of high-order CBFs termed Truncated Taylor CBF (TTCBF). TTCBF can handle constraints with arbitrary relative degrees while using only one design parameter to facilitate control design (see also Fig. 4).

Welcome to SigmaRL!

This repository provides the full code of SigmaRL, a Sample-efficient and generalizable Multi-Agent Reinforcement Learning (MARL) framework for motion planning of Connected and Automated Vehicles (CAVs).

SigmaRL is a decentralized MARL framework designed for motion planning of CAVs. We use VMAS, a vectorized differentiable simulator designed for efficient MARL benchmarking, as our simulator and customize our own RL environment. The first scenario in Fig. 1 mirrors the real-world conditions of our Cyber-Physical Mobility Lab (CPM Lab). We also support maps handcrafted in JOSM, an open-source editor for OpenStreetMap. Below you will find detailed guidance to create your OWN maps.

Figure 1: Demonstrating the generalization of SigmaRL (speed x2). Only the intersection part of the CPM scenario (the middle part in Fig. 1(a)) is used for training. All other scenarios are completely unseen. See our SigmaRL paper for more details.

Figure 2: We use an auxiliary MARL to learn dynamic priority assignments to address non-stationarity. Higher-priority agents communicate their actions (depicted by the colored lines) to lower-priority agents to stabilize the environment. See our XP-MARL paper for more details.

(a) Overtaking scenario with Center-to-Center (C2C)-based safety margin (traditional).

Figure 3: Demonstrating the safety and reduced conservatism of our MTV-based safety margin. In the overtaking scenario, while the traditional approach fails to overtake due to excessive conservatism (see (a)), ours succeeds (see (b)). Note that in the overtaking scenario, the slow-moving vehicle $j$ purposely obstructs vehicle $i$ three times to prevent it from overtaking. In the bypassing scenario, while the traditional approach requires a large lateral space due to excessive conservatism (see (c)), ours requires a smaller one (see (d)). See our MTV-Based CBF paper for more details.

(a) The standard HOCBF approach requires tuning two parameters ($\lambda_1$ and $\lambda_2$).

Figure 4: Our TTCBF approach reduces the number of parameters to tune when handling constraints with high relative degrees. See our TTCBF paper for more details.
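
As background for Fig. 4, the two parameters of the standard approach can be seen in the common HOCBF construction for a safety constraint $h(x) \geq 0$ with relative degree two. The sketch below assumes linear class-$\mathcal{K}$ functions with gains $\lambda_1, \lambda_2 > 0$; the notation is illustrative and may differ from the one used in the TTCBF paper.

$$
\psi_1 = \dot{h} + \lambda_1 h, \qquad \psi_2 = \dot{\psi}_1 + \lambda_2 \psi_1 = \ddot{h} + (\lambda_1 + \lambda_2)\,\dot{h} + \lambda_1 \lambda_2\, h \geq 0 .
$$

Both gains enter the final constraint, so they have to be tuned jointly. This is the tuning effort that a single-parameter formulation avoids.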

(a) An undertrained RL policy without our safety filter often caused collisions with road boundaries.

Figure 5: Demonstration of our safety filter for safety verification of an undertrained RL policy. See our CBF-Based Safety Filter Paper for more details.

Install

SigmaRL supports Python 3.9 to 3.12 and is OS independent (Windows/macOS/Linux). We recommend using a virtual environment. For example, if you are using conda:

```bash
conda create -n sigmarl python=3.12
conda activate sigmarl
```

We recommend installing sigmarl from source:

- Clone the repository and install it in editable mode:

```bash
git clone https://github.com/bassamlab/SigmaRL.git
cd SigmaRL
pip install -e .
```

- (Optional) Verify the installation by first launching your Python interpreter in a terminal:

```bash
python
```

Then run the following lines, which should print the version of the installed sigmarl:

```python
import sigmarl
print(sigmarl.__version__)
```

How to Use

Training

Run main_training.py. During training, every intermediate model that outperforms the currently saved one is saved automatically. You can also retrain or refine a trained model by setting the parameter is_continue_train in the file sigmarl/config.json to true. The saved model is then loaded at the start of the new training process.
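
If you prefer to toggle this flag from a script rather than editing the file by hand, a minimal sketch could look as follows. It assumes that is_continue_train is a top-level key of the JSON object in sigmarl/config.json; the helper name set_continue_training is hypothetical and not part of SigmaRL.

```python
import json
from pathlib import Path


def set_continue_training(config_path, enabled=True):
    """Set `is_continue_train` in a JSON config file and return the updated config."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: `is_continue_train` is a top-level key of the JSON object.
    config["is_continue_train"] = bool(enabled)
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


# Example call (run from the repository root):
# set_continue_training("sigmarl/config.json", enabled=True)
```

All other keys in the file are written back unchanged, although the indentation of the file may differ afterwards.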

sigmarl/scenarios/road_traffic.py defines the RL environment, such as the observation function and reward function. Besides, it provides an interactive interface, which also visualizes the environment. To open the interface, simply run this file. You can use arrow keys to control agents and use the tab key to switch between agents. Adjust the parameter scenario_type to choose a scenario. All available scenarios are listed in the variable SCENARIOS in sigmarl/constants.py. Before training, it is recommended to use the interactive interface to check if the environment is as expected.

Testing

After training, run main_testing.py to test your model. You may need to adjust the parameter path therein so that it points to the folder in which the target model was saved. Note: If the path to a saved model changes, you also need to update the value of where_to_save in the corresponding JSON file.
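
After moving a model folder, the where_to_save update mentioned in the note can be scripted. The sketch below is an illustration under two assumptions: the JSON file sits directly inside the model folder, and where_to_save is a top-level key whose value is the folder path. The helper name update_where_to_save is hypothetical.

```python
import json
from pathlib import Path


def update_where_to_save(model_dir):
    """Point `where_to_save` in each JSON file of `model_dir` at `model_dir` itself.

    Returns the names of the files that were updated.
    """
    model_dir = Path(model_dir)
    updated = []
    for json_file in sorted(model_dir.glob("*.json")):
        data = json.loads(json_file.read_text(encoding="utf-8"))
        # Assumption: the save location is stored under a top-level
        # `where_to_save` key; JSON files without it are left untouched.
        if isinstance(data, dict) and "where_to_save" in data:
            data["where_to_save"] = model_dir.as_posix()
            json_file.write_text(json.dumps(data, indent=4) + "\n", encoding="utf-8")
            updated.append(json_file.name)
    return updated
```

Check one updated file by hand afterwards, since the expected path format (for example, a trailing slash) is not specified here.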

Customize Your Own Maps

We support maps customized in JOSM, an open-source editor for OpenStreetMap. Follow these steps (video tutorial available here):
- Install JOSM from the website given above.
- To get an empty map that can be customized, do the following:
  - Open JOSM and click the green download button.
  - Zoom in and choose an arbitrary place on the map by drawing a rectangle. The area should be as empty as possible.
  - Clicking "Download" opens a new window. It should show a notification that no data could be found; otherwise, choose a different area.
- Customize the map by drawing lines. Note that all lanes you draw are considered center lines. You do not need to draw left and right boundaries, since our script determines them automatically later from a given lane width. The distance between the nodes of a lane should be approximately 0.1 meters. You can find useful hints and commands for customizing the map at Actions and Tools.
- Give each lane the key "lanes" and a unique value.
- Save the resulting .osm file under a name of your choice and store it at assets/maps.
- Go to utilities/constants.py and create a new entry for it in the dictionary "SCENARIOS". The key of the entry is the name of the map and the value is a dictionary, for which you should give at least the values for the keys map_path, lane_width, and scale. You should also provide a list for reference_paths_ids (which paths exist?) and a dictionary for neighboring_lanelet_ids (which lanes are adjacent?).
- Go to utilities/parse_osm.py. Adjust the parameter scenario_type and run it.
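
To make the "SCENARIOS" step more concrete, here is a sketch of what a new entry might look like. Only the five key names come from the steps above. The map name my_map, all values, and the exact shapes of reference_paths_ids and neighboring_lanelet_ids are placeholders chosen for illustration; compare with the existing entries in the constants file before copying anything. The helper missing_keys is hypothetical as well.

```python
# Keys named in the map-customization steps above.
REQUIRED_KEYS = (
    "map_path",
    "lane_width",
    "scale",
    "reference_paths_ids",
    "neighboring_lanelet_ids",
)

# Hypothetical entry: every value below is a placeholder, not a tested setting.
my_map_entry = {
    "map_path": "assets/maps/my_map.osm",  # where the .osm file was stored
    "lane_width": 0.3,                     # placeholder lane width
    "scale": 1.0,                          # placeholder scale factor
    # Assumed shape: one list of lane IDs per reference path.
    "reference_paths_ids": [["1", "2"], ["3"]],
    # Assumed shape: lane ID -> IDs of adjacent lanes.
    "neighboring_lanelet_ids": {"1": ["2"], "2": ["1", "3"], "3": ["2"]},
}


def missing_keys(entry):
    """Return the required keys that are absent from a scenario entry."""
    return [key for key in REQUIRED_KEYS if key not in entry]


# The entry would then be added to the dictionary under the map name.
SCENARIOS_EXAMPLE = {"my_map": my_map_entry}
```

Running missing_keys on a new entry before parsing the map is a cheap way to catch a forgotten key.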

Figure 6: Overview of currently available maps.

Papers

If you use this repository, please consider citing our papers.

1. SigmaRL

Jianye Xu, Pan Hu, and Bassam Alrifaee, "SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning," 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), Edmonton, AB, Canada, 2024, pp. 768-775, doi: 10.1109/ITSC58415.2024.10919918.

[![Jump to Fig. 1](https://img.shields.io/badge/Jump%20to-Fig.%201-blue)](#fig-generalization)

BibTeX

```bibtex
@inproceedings{xu2024sigmarl,
  title = {SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning},
  booktitle = {2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC)},
  author = {Xu, Jianye and Hu, Pan and Alrifaee, Bassam},
  year = {2024},
  pages = {768--775},
  issn = {2153-0017},
  doi = {10.1109/ITSC58415.2024.10919918}
}
```

Reproduce Experimental Results in the Paper:
- Check out the corresponding tag using git checkout 1.2.0
- Go to this page and download the zip file itsc24.zip. Unzip it, copy and paste the whole folder to the checkpoints folder at the root of this repository. The structure should be like this: root/checkpoints/itsc24/.
- Run sigmarl/evaluation_itsc24.py.
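
The unzip-and-copy step can also be done from Python. The sketch below assumes that the downloaded archive contains a single top-level folder named after the zip file (for example, itsc24.zip containing itsc24/); if the archive is laid out differently, the final check fails with an error instead of silently leaving a wrong structure. The helper name install_checkpoints is hypothetical.

```python
import zipfile
from pathlib import Path


def install_checkpoints(zip_path, repo_root):
    """Extract a checkpoint archive into `<repo_root>/checkpoints`.

    Returns the path of the extracted folder, e.g. `<repo_root>/checkpoints/itsc24`.
    """
    zip_path = Path(zip_path)
    target = Path(repo_root) / "checkpoints"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Assumption: the archive holds one top-level folder named like the zip file.
    extracted = target / zip_path.stem
    if not extracted.is_dir():
        raise FileNotFoundError(f"Expected folder {extracted} after extraction")
    return extracted


# Example call (run from the repository root):
# install_checkpoints("itsc24.zip", ".")
```

The same helper applies to the archives of the other papers listed below, since they follow the same root/checkpoints/<name>/ structure.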

You can also run testing_mappo_cavs.py to intuitively evaluate the trained models. Adjust the parameter path therein to specify the folder in which the target model was saved. Note: The evaluation results you get may deviate from those in the paper because we have carefully adjusted the performance metrics.

2. XP-MARL

Jianye Xu, Omar Sobhy, and Bassam Alrifaee, "XP-MARL: Auxiliary Prioritization in Multi-Agent Reinforcement Learning to Address Non-Stationarity," arXiv preprint arXiv:2409.11852, 2024.

[![Jump to Fig. 2](https://img.shields.io/badge/Jump%20to-Fig.%202-blue)](#fig-xp-marl)

BibTeX

```bibtex
@article{xu2024xp,
  title={{{XP-MARL}}: Auxiliary Prioritization in Multi-Agent Reinforcement Learning to Address Non-Stationarity},
  author={Xu, Jianye and Sobhy, Omar and Alrifaee, Bassam},
  journal={arXiv preprint arXiv:2409.11852},
  year={2024},
}
```

Reproduce Experimental Results in the Paper:
- Check out the corresponding tag using git checkout 1.2.0
- Go to this page and download the zip file icra25.zip. Unzip it, copy and paste the whole folder to the checkpoints folder at the root of this repository. The structure should be like this: root/checkpoints/icra25/.
- Run sigmarl/evaluation_icra25.py.

You can also run testing_mappo_cavs.py to intuitively evaluate the trained models. Adjust the parameter path therein to specify the folder in which the target model was saved.

3. MTV-Based CBF

Jianye Xu and Bassam Alrifaee, "Learning-Based Control Barrier Function with Provably Safe Guarantees: Reducing Conservatism with Heading-Aware Safety Margin," in European Control Conference (ECC), in press, 2025.

[![Jump to Fig. 3](https://img.shields.io/badge/Jump%20to-Fig.%203-blue)](#fig-mtv-based-cbf)

BibTeX

```bibtex
@inproceedings{xu2024learningbased,
  title = {Learning-Based Control Barrier Function with Provably Safe Guarantees: Reducing Conservatism with Heading-Aware Safety Margin},
  shorttitle = {Learning-Based Control Barrier Function with Provably Safe Guarantees},
  booktitle = {European Control Conference (ECC), in Press},
  author = {Xu, Jianye and Alrifaee, Bassam},
  year = {2025},
}
```

Reproduce Experimental Results in the Paper:

- Go to this page and download the zip file ecc25.zip. Unzip it, copy and paste the whole folder to the checkpoints folder at the root of this repository. The structure should be like this: root/checkpoints/ecc25/.
- Run sigmarl/evaluation_ecc25.py.

4. Truncated Taylor CBF (TTCBF)

Jianye Xu and Bassam Alrifaee, "High-Order Control Barrier Functions: Insights and a Truncated Taylor-Based Formulation," arXiv preprint arXiv:2503.15014, 2025.

[![Jump to Fig. 4](https://img.shields.io/badge/Jump%20to-Fig.%204-blue)](#fig-ttcbf)

BibTeX

```bibtex
@article{xu2025highorder,
  title = {High-Order Control Barrier Functions: Insights and a Truncated Taylor-Based Formulation},
  author = {Xu, Jianye and Alrifaee, Bassam},
  journal = {arXiv preprint arXiv:2503.15014},
  year = {2025},
}
```

Reproduce Experimental Results in the Paper:
- Check out the corresponding tag using git checkout 1.3.0
- Run sigmarl/hocbf_taylor.py.

5. CBF-Based Safety Filter

Jianye Xu, Chang Che, and Bassam Alrifaee, "A Real-Time Control Barrier Function-Based Safety Filter for Motion Planning with Arbitrary Road Boundary Constraints," arXiv preprint arXiv:2505.02395, 2025.

[![Jump to Fig. 5](https://img.shields.io/badge/Jump%20to-Fig.%205-blue)](#fig-safety-filter)

BibTeX

```bibtex
@article{xu2025realtime,
  title = {A Real-Time Control Barrier Function-Based Safety Filter for Motion Planning with Arbitrary Road Boundary Constraints},
  author = {Xu, Jianye and Che, Chang and Alrifaee, Bassam},
  journal = {arXiv preprint arXiv:2505.02395},
  year = {2025},
}
```

Reproduce Experimental Results in the Paper:
- Check out the corresponding tag using git checkout 1.4.0
- Go to this page and download the zip file itsc25.zip. Unzip it, copy and paste the whole folder to the checkpoints folder at the root of this repository. The structure should be like this: root/checkpoints/itsc25/.
- Run sigmarl/evaluation_itsc25.py.

6. CPM Lab Benchmark

Julius Beerwerth, Jianye Xu, Simon Schäfer, Fynn Belderink, and Bassam Alrifaee, "From Simulation to Reality: A Benchmark for MARL in the Cyber-Physical Mobility Lab," arXiv preprint arXiv:TBD, 2025.

Reproduce Experimental Results of the SigmaRL Simulation in the Paper:
- Check out the corresponding tag using git checkout 1.5.0
- Go to this page and download the zip file at25.zip. Unzip it, copy and paste the whole folder to the checkpoints folder at the root of this repository. The structure should be like this: root/checkpoints/at25/.
- Run sigmarl/eva_at25/run_models_parallel.py to evaluate the downloaded models. The evaluation results will be saved automatically.
  - This script requires Python parallel workers.
  - Alternatively, you can run sigmarl/eva_at25/run_models.py if you do not want to use parallel workers.
- After the evaluation, run sigmarl/eva_at25/marl_aggregated_evaluation.py to analyze the evaluation results and obtain the performance metrics.

TODOs

- Improve safety
  - [ ] Integrating Control Barrier Functions (CBFs)
    - [x] Proof of concept with two agents (see the MTV-Based CBF paper here)
    - [x] High-Order CBFs (see the TTCBF paper here)
    - [x] Collision avoidance with road boundaries (see the CBF-Based Safety Filter paper here)
  - [ ] Integrating Model Predictive Control (MPC)
- Address non-stationarity
  - [x] Integrating prioritization (see the XP-MARL paper here)
- Effective observation design
  - [ ] Image-based representation of observations
  - [ ] Historic observations
  - [ ] Attention mechanism
- Misc
  - [x] OpenStreetMap support (see guidance here)
  - [x] Contribute our CPM scenario as an MARL benchmark scenario in VMAS (see news here)
  - [x] Update to the latest versions of Torch, TorchRL, and VMAS
  - [x] Support Python 3.11+

Acknowledgments

This research was supported by the Bundesministerium für Digitales und Verkehr (German Federal Ministry for Digital and Transport) within the project "Harmonizing Mobility" (grant number 19FS2035A).

Owner

Name: Bassam Lab
Login: bassamlab
Kind: organization
Location: Germany

Repositories: 1
Profile: https://github.com/bassamlab

Control of Autonomous Systems Lab, University of the Bundeswehr Munich

GitHub Events

Total

Create event: 2
Release event: 2
Issues event: 2
Watch event: 33
Issue comment event: 8
Push event: 14
Fork event: 1

Last Year

Create event: 2
Release event: 2
Issues event: 2
Watch event: 33
Issue comment event: 8
Push event: 14
Fork event: 1

Figure 1 panels: (a) CPM scenario. (b) Intersection scenario. (c) On-ramp scenario. (d) "Roundabout" scenario.

Figure 3 panels: (a) Overtaking scenario with Center-to-Center (C2C)-based safety margin (traditional). (b) Overtaking scenario with Minimum Translation Vector (MTV)-based safety margin (ours). (c) Bypassing scenario with C2C-based safety margin (traditional). (d) Bypassing scenario with MTV-based safety margin (ours).


Science Score: 49.0%
