marl_mrt2a
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org, springer.com, ieee.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: lcdbezerra
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 3.01 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation
This repository contains the code and simulation environments used in the paper:
"Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
Lucas C. D. Bezerra, Ataíde M. G. dos Santos, and Shinkyu Park
Accepted: IEEE Robotics and Automation Letters, 2025
TL;DR:
This repository contains:
- GyMRT²A Environment: a Gym environment for discrete-space, discrete-time MRTA with multi-robot tasks.
- MARL-MRT²A: a MAPPO-based algorithm that enables learning of decentralized, low-communication, generalizable task allocation policies for MRT²A; this implementation builds upon the EPyMARL codebase.
- PCFA Baseline: our implementation of the decentralized market-based approach PCFA.
To get started, please see the Installation section and run the provided examples.
Overview
We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA) under partial observability. Our method extends MAPPO with multiple integrated components that allow robots to coordinate and revise task assignments in dynamic, partially observable environments.
Key Components
- Spatial Action Maps: Agents select task locations in spatial coordinates, enabling long-horizon task planning.
- Robot Motion Planning: Each robot computes a collision-free A* path to the selected task.
- Intention Sharing: Robots share decayed path-based intention maps with nearby agents to support coordination.
- Custom Policy Architecture: We propose using a U-Net as the policy architecture, but our code also supports custom architectures (any model that can be implemented as a torch.nn.Sequential; see nn_utils.py for the modules currently available).
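To make the motion-planning and intention-sharing components concrete, here is a small self-contained sketch of the two ideas. It is illustrative only: the grid layout, decay factor, and function names are assumptions for this example, not the repository's API (the actual pathfinding code lives in astar.py, adapted from Red Blob Games).

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 1 marks an obstacle.
    Returns the cell sequence from start to goal, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    heur = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])  # Manhattan, admissible
    frontier = [(heur(start), 0, start)]
    came_from, cost = {start: None}, {start: 0}
    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal:  # reconstruct the path by walking parents back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < cost.get(nxt, float("inf"))):
                cost[nxt], came_from[nxt] = g + 1, cur
                heapq.heappush(frontier, (g + 1 + heur(nxt), g + 1, nxt))
    return None

def intention_map(path, rows, cols, decay=0.9):
    """Decayed path-based intention map: cells the robot plans to visit
    sooner get values near 1; later cells decay geometrically."""
    m = [[0.0] * cols for _ in range(rows)]
    for t, (r, c) in enumerate(path):
        m[r][c] = max(m[r][c], decay ** t)
    return m

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
path = astar(grid, (0, 0), (2, 3))       # route around the obstacle block
imap = intention_map(path, 3, 4)         # map another robot could receive
```

In the actual method, a map like `imap` would be shared with nearby agents so they can discount task locations a teammate is already heading toward.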
Environment
We implement our experiments in a custom Gym environment called GyMRT²A, which simulates:
- Grid-world task allocation with dynamic task spawns (Bernoulli or instant respawn)
- Partial observability (limited view and communication ranges)
- Multi-level tasks requiring varying coalition sizes
- Motion planning with obstacles and other agents
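As a rough sketch of what Bernoulli task spawning means here (the function signature, spawn probability, and level range below are illustrative assumptions, not the environment's actual interface):

```python
import random

def spawn_tasks(occupied, rows, cols, p, max_level=3, rng=random):
    """Each free cell independently spawns a task with probability p per step.
    Returns {cell: level}, where level is the coalition size the task needs."""
    new_tasks = {}
    for r in range(rows):
        for c in range(cols):
            if (r, c) not in occupied and rng.random() < p:
                new_tasks[(r, c)] = rng.randint(1, max_level)
    return new_tasks

rng = random.Random(0)  # seeded for reproducibility
tasks = spawn_tasks(occupied={(0, 0)}, rows=5, cols=5, p=0.1, rng=rng)
```

Multi-level tasks (levels 1 to `max_level`) are what makes coalition formation necessary: a level-k task only completes once k robots attend to it.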
Repository Structure
```
marl_mrt2a/
├── marl_mrt2a/
│   ├── env/                  # GyMRT²A environment
│   ├── PCFA/                 # Baseline implementation
│   ├── marl/                 # Our method's implementation
│   └── examples/             # Reproducible experiments
│       └── main_comparison/  # Comparison with baseline and ablation studies
├── LICENSE
└── README.md
```
Installation
Prerequisites
- Python 3.10 or higher
- PyTorch
- NumPy
- OpenAI Gym
Setup
Clone the repository:
```bash
git clone https://github.com/lcdbezerra/marl_mrt2a.git
cd marl_mrt2a
```
Create a conda environment:
```bash
conda create -n marl_mrt2a python=3.10 -y
conda activate marl_mrt2a
conda install pip -y
```
Install the base environment and the baseline (development mode):
```bash
cd marl_mrt2a/env
pip install -e .
pip install -U pygame --user
conda install -c conda-forge libstdcxx-ng -y
cd ../PCFA
pip install -e .
cd ../
```
Set up Weights & Biases for experiment tracking:
```bash
pip install wandb
wandb login
```
Install MARL dependencies:
```bash
cd marl
pip install -r requirements.txt
```
Experiments
Experiments are reproducible through the examples in the examples/ directory:
Main Comparison (examples/main_comparison/)
Compare the proposed method against baseline approaches:
- Traditional task allocation methods
- Standard MAPPO
- Other multi-agent learning approaches
Running Experiments
```bash
# Run main comparison experiments
python examples/main_comparison/run_comparison.py
```

<!--
```bash
# Run scalability experiments
python examples/scalability/run_scalability.py

# Run generalizability experiments
python examples/generalizability/run_generalizability.py

# Watch trained agents
python examples/watch_episode/watch_episode.py
```
-->
Citation
If you use this code, please cite:
```bibtex
@article{bezerra2025learningdcfmrta,
  author={Lucas C. D. Bezerra and Ataíde M. G. dos Santos and Shinkyu Park},
  journal={IEEE Robotics and Automation Letters},
  title={Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation},
  year={2025},
  volume={10},
  number={9},
  pages={9216-9223},
  doi={10.1109/LRA.2025.3592080}
}
```
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
For information about derivative works and third-party components, see the NOTICE file.
Third-Party Code
This repository includes code under the Apache License 2.0:
- Multi-agent reinforcement learning framework based on EPyMARL and PyMARL
- astar.py – A* pathfinding implementation from Red Blob Games, Copyright 2014 Red Blob Games, licensed under Apache License 2.0. Adapted by Lucas C. D. Bezerra.
Owner
- Name: Lucas Bezerra
- Login: lcdbezerra
- Kind: user
- Company: KAUST
- Website: lucascamara.com
- Twitter: lcdbezerra
- Repositories: 1
- Profile: https://github.com/lcdbezerra
PhD Student @ KAUST. Reinforcement Learning researcher.
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "C. D. Bezerra"
    given-names: "Lucas"
    orcid: "https://orcid.org/0000-0002-3967-4374"
  - family-names: "M. G. dos Santos"
    given-names: "Ataíde"
    orcid: "https://orcid.org/0000-0003-2725-1734"
  - family-names: "Park"
    given-names: "Shinkyu"
    orcid: "https://orcid.org/0000-0002-8643-404X"
title: "Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
version: "1.0.0"
date-released: 2025-02-22
url: "https://github.com/lcdbezerra/marl_mrt2a"
license: Apache-2.0
preferred-citation:
  type: article
  authors:
    - family-names: "C. D. Bezerra"
      given-names: "Lucas"
      orcid: "https://orcid.org/0000-0002-3967-4374"
    - family-names: "M. G. dos Santos"
      given-names: "Ataíde"
      orcid: "https://orcid.org/0000-0003-2725-1734"
    - family-names: "Park"
      given-names: "Shinkyu"
      orcid: "https://orcid.org/0000-0002-8643-404X"
  title: "Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
  year: 2025
  journal: "IEEE Robotics and Automation Letters"
  volume: 10
  issue: 9
  start: 9216
  end: 9223
  doi: "10.1109/LRA.2025.3592080"
```
GitHub Events
Total
- Watch event: 1
- Push event: 5
- Public event: 1
Last Year
- Watch event: 1
- Push event: 5
- Public event: 1
Dependencies
- numpy *