marl_mrt2a
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org, springer.com, ieee.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: lcdbezerra
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 3.01 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation
This repository contains the code and simulation environments used in the paper:
"Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
Lucas C. D. Bezerra, Ataíde M. G. dos Santos, and Shinkyu Park
Accepted: IEEE Robotics and Automation Letters, 2025
TL;DR:
This repository contains:
- GyMRT²A Environment: a Gym environment for discrete-space, discrete-time MRTA with multi-robot tasks.
- MARL-MRT²A: a MAPPO-based algorithm that enables learning of decentralized, low-communication, generalizable task allocation policies for MRT²A; this implementation builds upon the EPyMARL codebase.
- PCFA Baseline: our implementation of the decentralized market-based approach PCFA.
To get started, please see the Installation section and run the provided examples.
Overview
We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA) under partial observability. Our method extends MAPPO with multiple integrated components that allow robots to coordinate and revise task assignments in dynamic, partially observable environments.
Key Components
- Spatial Action Maps: Agents select task locations in spatial coordinates, enabling long-horizon task planning.
- Robot Motion Planning: Each robot computes a collision-free A* path to the selected task.
- Intention Sharing: Robots share decayed path-based intention maps with nearby agents to support coordination.
- Custom Policy Architecture: We propose using a U-Net as the policy architecture, but our code also supports custom architectures (any model that can be implemented as a torch.nn.Sequential; see nn_utils.py for the modules currently available).
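To make the motion-planning and intention-sharing components concrete, here is a small self-contained sketch of the two ideas. It is illustrative only: the grid layout, decay factor, and function names are assumptions for this example, not the repository's API (the actual pathfinding code lives in astar.py, adapted from Red Blob Games).

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 1 marks an obstacle.
    Returns the cell sequence from start to goal, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    heur = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])  # Manhattan, admissible
    frontier = [(heur(start), 0, start)]
    came_from, cost = {start: None}, {start: 0}
    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal:  # reconstruct the path by walking parents back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < cost.get(nxt, float("inf"))):
                cost[nxt], came_from[nxt] = g + 1, cur
                heapq.heappush(frontier, (g + 1 + heur(nxt), g + 1, nxt))
    return None

def intention_map(path, rows, cols, decay=0.9):
    """Decayed path-based intention map: cells the robot plans to visit
    sooner get values near 1; later cells decay geometrically."""
    m = [[0.0] * cols for _ in range(rows)]
    for t, (r, c) in enumerate(path):
        m[r][c] = max(m[r][c], decay ** t)
    return m

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
path = astar(grid, (0, 0), (2, 3))       # route around the obstacle block
imap = intention_map(path, 3, 4)         # map another robot could receive
```

In the actual method, a map like `imap` would be shared with nearby agents so they can discount task locations a teammate is already heading toward.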
Environment
We implement our experiments in a custom Gym environment called GyMRT²A, which simulates:
- Grid-world task allocation with dynamic task spawns (Bernoulli or instant respawn)
- Partial observability (limited view and communication ranges)
- Multi-level tasks requiring varying coalition sizes
- Motion planning with obstacles and other agents
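As a rough sketch of what Bernoulli task spawning means here (the function signature, spawn probability, and level range below are illustrative assumptions, not the environment's actual interface):

```python
import random

def spawn_tasks(occupied, rows, cols, p, max_level=3, rng=random):
    """Each free cell independently spawns a task with probability p per step.
    Returns {cell: level}, where level is the coalition size the task needs."""
    new_tasks = {}
    for r in range(rows):
        for c in range(cols):
            if (r, c) not in occupied and rng.random() < p:
                new_tasks[(r, c)] = rng.randint(1, max_level)
    return new_tasks

rng = random.Random(0)  # seeded for reproducibility
tasks = spawn_tasks(occupied={(0, 0)}, rows=5, cols=5, p=0.1, rng=rng)
```

Multi-level tasks (levels 1 to `max_level`) are what makes coalition formation necessary: a level-k task only completes once k robots attend to it.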
Repository Structure
```
marl_mrt2a/
├── marl_mrt2a/
│   ├── env/                  # GyMRT²A environment
│   ├── PCFA/                 # Baseline implementation
│   ├── marl/                 # Our method's implementation
│   └── examples/             # Reproducible experiments
│       └── main_comparison/  # Comparison with baseline and ablation studies
├── LICENSE
└── README.md
```
Installation
Prerequisites
- Python 3.10 or higher
- PyTorch
- NumPy
- OpenAI Gym
Setup
Clone the repository:
```bash
git clone https://github.com/lcdbezerra/marl_mrt2a.git
cd marl_mrt2a
```
Create a conda environment:
```bash
conda create -n marl_mrt2a python=3.10 -y
conda activate marl_mrt2a
conda install pip -y
```
Install the base environment and the baseline (development mode):
```bash
cd marl_mrt2a/env
pip install -e .
pip install -U pygame --user
conda install -c conda-forge libstdcxx-ng -y
cd ../PCFA
pip install -e .
cd ../
```
Set up Weights & Biases for experiment tracking:
```bash
pip install wandb
wandb login
```
Install MARL dependencies:
```bash
cd marl
pip install -r requirements.txt
```
Experiments
Experiments are reproducible through the examples in the examples/ directory:
Main Comparison (examples/main_comparison/)
Compare the proposed method against baseline approaches:
- Traditional task allocation methods
- Standard MAPPO
- Other multi-agent learning approaches
Running Experiments
```bash
# Run main comparison experiments
python examples/main_comparison/run_comparison.py
```

<!--
```bash
# Run scalability experiments
python examples/scalability/run_scalability.py

# Run generalizability experiments
python examples/generalizability/run_generalizability.py

# Watch trained agents
python examples/watch_episode/watch_episode.py
```
-->
Citation
If you use this code, please cite:
```bibtex
@article{bezerra2025learningdcfmrta,
  author={Lucas C. D. Bezerra and Ataíde M. G. dos Santos and Shinkyu Park},
  journal={IEEE Robotics and Automation Letters},
  title={Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation},
  year={2025},
  volume={10},
  number={9},
  pages={9216-9223},
  doi={10.1109/LRA.2025.3592080}
}
```
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
For information about derivative works and third-party components, see the NOTICE file.
Third-Party Code
This repository includes code under the Apache License 2.0:
- Multi-agent reinforcement learning framework based on EPyMARL and PyMARL
- astar.py – A* pathfinding implementation from Red Blob Games, Copyright 2014 Red Blob Games, licensed under Apache License 2.0. Adapted by Lucas C. D. Bezerra.
Owner
- Name: Lucas Bezerra
- Login: lcdbezerra
- Kind: user
- Company: KAUST
- Website: lucascamara.com
- Twitter: lcdbezerra
- Repositories: 1
- Profile: https://github.com/lcdbezerra
PhD Student @ KAUST. Reinforcement Learning researcher.
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "C. D. Bezerra"
    given-names: "Lucas"
    orcid: "https://orcid.org/0000-0002-3967-4374"
  - family-names: "M. G. dos Santos"
    given-names: "Ataíde"
    orcid: "https://orcid.org/0000-0003-2725-1734"
  - family-names: "Park"
    given-names: "Shinkyu"
    orcid: "https://orcid.org/0000-0002-8643-404X"
title: "Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
version: "1.0.0"
date-released: 2025-02-22
url: "https://github.com/lcdbezerra/marl_mrt2a"
license: Apache-2.0
preferred-citation:
  type: article
  authors:
    - family-names: "C. D. Bezerra"
      given-names: "Lucas"
      orcid: "https://orcid.org/0000-0002-3967-4374"
    - family-names: "M. G. dos Santos"
      given-names: "Ataíde"
      orcid: "https://orcid.org/0000-0003-2725-1734"
    - family-names: "Park"
      given-names: "Shinkyu"
      orcid: "https://orcid.org/0000-0002-8643-404X"
  title: "Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
  year: 2025
  journal: "IEEE Robotics and Automation Letters"
  volume: 10
  issue: 9
  start: 9216
  end: 9223
  doi: "10.1109/LRA.2025.3592080"
```
GitHub Events
Total
- Watch event: 1
- Push event: 5
- Public event: 1
Last Year
- Watch event: 1
- Push event: 5
- Public event: 1
Dependencies
- numpy *