Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, springer.com, ieee.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: lcdbezerra
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 3.01 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 10 months ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

README.md

Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation

License: Apache 2.0 · Python 3.10+ · IEEE Xplore · arXiv

This repository contains the code and simulation environments used in the paper:

"Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
Lucas C. D. Bezerra, Ataíde M. G. dos Santos, and Shinkyu Park
Accepted: IEEE Robotics and Automation Letters, 2025

TL;DR: This repository contains:

  • GyMRT²A Environment: a Gym environment for discrete-space, discrete-time MRTA with multi-robot tasks.
  • MARL-MRT²A: a MAPPO-based algorithm that enables learning of decentralized, low-communication, generalizable task allocation policies for MRT²A; this implementation builds upon the EPyMARL codebase.
  • PCFA Baseline: our implementation of the decentralized market-based approach PCFA.

To get started, please see the Installation section and run the provided examples.
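GyMRT²A follows the standard Gym reset/step loop. As a rough illustration of what a discrete-space, discrete-time MRTA environment of this shape looks like, here is a minimal, self-contained sketch; all class, method, and attribute names below are hypothetical, not the actual GyMRT²A API (see marl_mrt2a/env for the real environment):

```python
import numpy as np

class ToyMRTAEnv:
    """Minimal Gym-style sketch of a discrete grid MRTA environment.

    Illustrative only: the real GyMRT2A environment defines its own
    observation/action spaces; every name here is hypothetical.
    """

    def __init__(self, grid_size=8, n_robots=3, n_tasks=4, seed=0):
        self.grid_size = grid_size
        self.n_robots = n_robots
        self.n_tasks = n_tasks
        self.rng = np.random.default_rng(seed)

    def reset(self):
        # Place robots and tasks on distinct random grid cells.
        cells = self.rng.choice(self.grid_size ** 2,
                                size=self.n_robots + self.n_tasks,
                                replace=False)
        self.robots = [divmod(int(c), self.grid_size) for c in cells[:self.n_robots]]
        self.tasks = {divmod(int(c), self.grid_size) for c in cells[self.n_robots:]}
        return self._obs()

    def step(self, actions):
        # actions: one (dy, dx) move per robot, clipped to the grid.
        for i, (dy, dx) in enumerate(actions):
            y, x = self.robots[i]
            self.robots[i] = (min(max(y + dy, 0), self.grid_size - 1),
                              min(max(x + dx, 0), self.grid_size - 1))
        # A task completes when a robot reaches its cell.
        done_tasks = self.tasks & set(self.robots)
        self.tasks -= done_tasks
        reward = float(len(done_tasks))
        done = not self.tasks
        return self._obs(), reward, done, {}

    def _obs(self):
        # Two-channel grid: robot positions and remaining tasks.
        obs = np.zeros((2, self.grid_size, self.grid_size), dtype=np.float32)
        for y, x in self.robots:
            obs[0, y, x] = 1.0
        for y, x in self.tasks:
            obs[1, y, x] = 1.0
        return obs
```

A training loop would then call `reset()` once and `step(actions)` per timestep, exactly as with any Gym environment.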

Overview

We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA) under partial observability. Our method extends MAPPO with multiple integrated components that allow robots to coordinate and revise task assignments in dynamic, partially observable environments.

Key Components

  • Spatial Action Maps: Agents select task locations in spatial coordinates, enabling long-horizon task planning.
  • Robot Motion Planning: Each robot computes a collision-free A* path to the selected task.
  • Intention Sharing: Robots share decayed path-based intention maps with nearby agents to support coordination.
  • Custom Policy Architecture: We propose a U-Net as the policy architecture, but our code supports custom architectures that can be implemented as a torch.nn.Sequential (see nn_utils.py for the currently available modules).
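The per-robot motion-planning step can be sketched compactly. The following is a generic grid A* with a Manhattan heuristic, written from scratch for illustration; it is not the repository's astar.py (which is adapted from Red Blob Games and has its own interface):

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[y][x] == 1 marks an obstacle.

    Compact sketch of the idea behind the repo's astar.py; the actual
    module (adapted from Red Blob Games) has a different interface.
    """
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan distance
    frontier = [(h(start), 0, start)]   # (f = g + h, g, cell)
    came_from, cost = {start: None}, {start: 0}
    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal:
            path = []
            while cur is not None:       # walk parents back to start
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        y, x = cur
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if 0 <= ny < len(grid) and 0 <= nx < len(grid[0]) and not grid[ny][nx]:
                ng = g + 1
                if (ny, nx) not in cost or ng < cost[(ny, nx)]:
                    cost[(ny, nx)] = ng
                    came_from[(ny, nx)] = cur
                    heapq.heappush(frontier, (ng + h((ny, nx)), ng, (ny, nx)))
    return None  # goal unreachable
```

With an admissible heuristic and unit step costs, the first time the goal is popped the returned path is shortest.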

Environment

We implement our experiments in a custom Gym environment called GyMRT²A, which simulates:

  • Grid-world task allocation with dynamic task spawns (Bernoulli or instant respawn)
  • Partial observability (limited view and communication ranges)
  • Multi-level tasks requiring varying coalition sizes
  • Motion planning with obstacles and other agents
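Bernoulli task spawning of the kind listed above can be sketched in a few lines; the function and parameter names below are illustrative assumptions, not GyMRT²A's actual API:

```python
import numpy as np

def spawn_tasks(task_grid, p_spawn, max_level=3, rng=None):
    """Bernoulli spawning sketch: each empty cell independently spawns
    a task of random level 1..max_level with probability p_spawn.

    task_grid: 2-D int array, 0 = empty, k > 0 = task requiring a
    coalition of size k. Names are illustrative, not GyMRT2A's API.
    """
    rng = rng or np.random.default_rng()
    empty = task_grid == 0
    # Independent Bernoulli(p_spawn) trial per empty cell.
    spawn = empty & (rng.random(task_grid.shape) < p_spawn)
    levels = rng.integers(1, max_level + 1, size=task_grid.shape)
    return np.where(spawn, levels, task_grid)
```

Instant respawn corresponds to `p_spawn = 1.0`, i.e. every completed task is immediately replaced on the next spawn step.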

Repository Structure

```
marl_mrt2a/
├── marl_mrt2a/
│   ├── env/                  # GyMRT²A environment
│   ├── PCFA/                 # Baseline implementation
│   ├── marl/                 # Our method's implementation
│   └── examples/             # Reproducible experiments
│       └── main_comparison/  # Comparison with baseline and ablation studies
├── LICENSE
└── README.md
```

Installation

Prerequisites

  • Python 3.10 or higher
  • PyTorch
  • NumPy
  • OpenAI Gym

Setup

  1. Clone the repository:

```bash
git clone https://github.com/lcdbezerra/marl_mrt2a.git
cd marl_mrt2a
```

  2. Create a conda environment:

```bash
conda create -n marl_mrt2a python=3.10 -y
conda activate marl_mrt2a
conda install pip -y
```

  3. Install the base environment and the baseline (development mode):

```bash
cd marl_mrt2a/env
pip install -e .
pip install -U pygame --user
conda install -c conda-forge libstdcxx-ng -y
cd ../PCFA
pip install -e .
cd ../
```

  4. Set up Weights & Biases for experiment tracking:

```bash
pip install wandb
wandb login
```

  5. Install MARL dependencies:

```bash
cd marl
pip install -r requirements.txt
```

Experiments

Experiments are reproducible through the examples in the examples/ directory:

Main Comparison (examples/main_comparison/)

Compare the proposed method against baseline approaches:

  • Traditional task allocation methods
  • Standard MAPPO
  • Other multi-agent learning approaches

Running Experiments

```bash
# Run main comparison experiments
python examples/main_comparison/run_comparison.py
```

<!--
```bash
# Run scalability experiments
python examples/scalability/run_scalability.py

# Run generalizability experiments
python examples/generalizability/run_generalizability.py

# Watch trained agents
python examples/watch_episode/watch_episode.py
```
-->

Citation

If you use this code, please cite:

```bibtex
@article{bezerra2025learningdcfmrta,
  author  = {Lucas C. D. Bezerra and Ataíde M. G. dos Santos and Shinkyu Park},
  journal = {IEEE Robotics and Automation Letters},
  title   = {Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation},
  year    = {2025},
  volume  = {10},
  number  = {9},
  pages   = {9216-9223},
  doi     = {10.1109/LRA.2025.3592080}
}
```

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

For information about derivative works and third-party components, see the NOTICE file.

Third-Party Code

This repository includes code under the Apache License 2.0:

  • Multi-agent reinforcement learning framework based on EPyMARL and PyMARL
  • astar.py – A* pathfinding implementation from Red Blob Games, Copyright 2014 Red Blob Games, licensed under the Apache License 2.0. Adapted by Lucas C. D. Bezerra.

Owner

  • Name: Lucas Bezerra
  • Login: lcdbezerra
  • Kind: user
  • Company: KAUST

PhD Student @ KAUST. Reinforcement Learning researcher.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "C. D. Bezerra"
    given-names: "Lucas"
    orcid: "https://orcid.org/0000-0002-3967-4374"
  - family-names: "M. G. dos Santos"
    given-names: "Ataíde"
    orcid: "https://orcid.org/0000-0003-2725-1734"
  - family-names: "Park"
    given-names: "Shinkyu"
    orcid: "https://orcid.org/0000-0002-8643-404X"
title: "Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
version: "1.0.0"
date-released: 2025-02-22
url: "https://github.com/lcdbezerra/marl_mrt2a"
license: Apache-2.0
preferred-citation:
  type: article
  authors:
    - family-names: "C. D. Bezerra"
      given-names: "Lucas"
      orcid: "https://orcid.org/0000-0002-3967-4374"
    - family-names: "M. G. dos Santos"
      given-names: "Ataíde"
      orcid: "https://orcid.org/0000-0003-2725-1734"
    - family-names: "Park"
      given-names: "Shinkyu"
      orcid: "https://orcid.org/0000-0002-8643-404X"
  title: "Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
  year: 2025
  journal: "IEEE Robotics and Automation Letters"
  volume: 10
  issue: 9
  start: 9216
  end: 9223
  doi: "10.1109/LRA.2025.3592080"

GitHub Events

Total
  • Watch event: 1
  • Push event: 5
  • Public event: 1
Last Year
  • Watch event: 1
  • Push event: 5
  • Public event: 1

Dependencies

marl_mrt2a/env/setup.py pypi
  • numpy *