dfac

[ICML 2021] DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning

https://github.com/j3soon/dfac

Science Score: 41.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.2%) to scientific vocabulary

Keywords

dfac icml-2021 multi-agent-reinforcement-learning reinforcement-learning smac starcraft2

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: j3soon
License: apache-2.0
Language: Python
Default Branch: master
Homepage: https://j3soon.github.io/dfac
Size: 1.1 MB

Statistics

Stars: 32
Watchers: 1
Forks: 3
Open Issues: 0
Releases: 0

Topics

dfac icml-2021 multi-agent-reinforcement-learning reinforcement-learning smac starcraft2

Created almost 5 years ago · Last pushed about 3 years ago

Metadata Files

Readme License Citation

Distributional Value Function Factorization (DFAC) Framework

This is the official repository that contains the source code for the DFAC paper:

[ICML 2021] DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning

If you have any questions regarding the paper or code, ask by submitting an issue.

An extended version of the paper has been published in the Journal of Machine Learning Research (JMLR) 2023. Please refer to the dfac-extended repository for more information.

Gameplay Video Preview

Learned policy of DDN on Super Hard & Ultra Hard maps:

https://youtu.be/MLdqyyPcv9U

Installation

Install docker, nvidia-docker, and nvidia-container-runtime. You can refer to this document for installation instructions.

Execute the following commands in your Linux terminal to build the docker image:

```sh
# Clone the repository
git clone https://github.com/j3soon/dfac.git
cd dfac

# Download StarCraft 2.4.10
wget http://blzdistsc2-a.akamaihd.net/Linux/SC2.4.10.zip

# Extract the files to StarCraftII directory
unzip -P iagreetotheeula SC2.4.10.zip
mv SC2.4.10.zip ..

# Build docker image
docker build . --build-arg DOCKER_BASE=nvcr.io/nvidia/tensorflow:19.12-tf1-py3 -t j3soon/dfac:1.0
```

Launch a docker container:

```sh
docker run --gpus all \
    --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    --rm \
    -it \
    -v "$(pwd)"/pymarl:/root/pymarl \
    -v "$(pwd)"/results:/results \
    -e DISPLAY=$DISPLAY \
    --device /dev/snd \
    j3soon/dfac:1.0 /bin/bash
```

Run the following command in the docker container for quick testing:

```sh
cd /root/pymarl
python3 src/main.py --config=ddn --env-config=sc2 with env_args.map_name=3m t_max=50000
```

After training finishes, leave the container by running exit. The container is deleted automatically because of the --rm flag.

The results are stored in ./results.

We chose to release the code as a Docker setup for better reproducibility and ease of use. To install directly, or to run the code in a virtualenv or conda environment, refer to the Dockerfile. If you still have trouble setting up the environment, open an issue and describe the problem you encountered.

Reproducing

The following is the list of commands used for the experiments in the paper:

```sh
# 3s5z_vs_3s6z
python3 src/main.py --config=iql --env-config=sc2 with env_args.map_name=3s5z_vs_3s6z rnn_hidden_dim=512
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=3s5z_vs_3s6z rnn_hidden_dim=128
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=3s5z_vs_3s6z rnn_hidden_dim=128
python3 src/main.py --config=diql --env-config=sc2 with env_args.map_name=3s5z_vs_3s6z rnn_hidden_dim=256
python3 src/main.py --config=ddn --env-config=sc2 with env_args.map_name=3s5z_vs_3s6z rnn_hidden_dim=512
python3 src/main.py --config=dmix --env-config=sc2 with env_args.map_name=3s5z_vs_3s6z rnn_hidden_dim=256

# 6h_vs_8z
python3 src/main.py --config=iql --env-config=sc2 with env_args.map_name=6h_vs_8z rnn_hidden_dim=128
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=6h_vs_8z rnn_hidden_dim=128
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=6h_vs_8z rnn_hidden_dim=256
python3 src/main.py --config=diql --env-config=sc2 with env_args.map_name=6h_vs_8z rnn_hidden_dim=512
python3 src/main.py --config=ddn --env-config=sc2 with env_args.map_name=6h_vs_8z rnn_hidden_dim=512
python3 src/main.py --config=dmix --env-config=sc2 with env_args.map_name=6h_vs_8z rnn_hidden_dim=256

# MMM2
python3 src/main.py --config=iql --env-config=sc2 with env_args.map_name=MMM2 rnn_hidden_dim=256
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=MMM2 rnn_hidden_dim=64
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=MMM2 rnn_hidden_dim=64
python3 src/main.py --config=diql --env-config=sc2 with env_args.map_name=MMM2 rnn_hidden_dim=512
python3 src/main.py --config=ddn --env-config=sc2 with env_args.map_name=MMM2 rnn_hidden_dim=512
python3 src/main.py --config=dmix --env-config=sc2 with env_args.map_name=MMM2 rnn_hidden_dim=256

# 27m_vs_30m
python3 src/main.py --config=iql --env-config=sc2 with env_args.map_name=27m_vs_30m rnn_hidden_dim=256
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=27m_vs_30m rnn_hidden_dim=64
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=27m_vs_30m rnn_hidden_dim=64
python3 src/main.py --config=diql --env-config=sc2 with env_args.map_name=27m_vs_30m rnn_hidden_dim=512
python3 src/main.py --config=ddn --env-config=sc2 with env_args.map_name=27m_vs_30m rnn_hidden_dim=128
python3 src/main.py --config=dmix --env-config=sc2 with env_args.map_name=27m_vs_30m rnn_hidden_dim=128

# corridor
python3 src/main.py --config=iql --env-config=sc2 with env_args.map_name=corridor rnn_hidden_dim=256
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=corridor rnn_hidden_dim=128
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=corridor rnn_hidden_dim=256
python3 src/main.py --config=diql --env-config=sc2 with env_args.map_name=corridor rnn_hidden_dim=512
python3 src/main.py --config=ddn --env-config=sc2 with env_args.map_name=corridor rnn_hidden_dim=128
python3 src/main.py --config=dmix --env-config=sc2 with env_args.map_name=corridor rnn_hidden_dim=64
```
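The hidden-size settings above form a small map-by-algorithm table, so the command list can be generated instead of copied by hand. The following Python sketch shows one possible way to do that. `ALGORITHMS`, `RNN_HIDDEN_DIM`, `build_command`, and `all_commands` are names introduced here for illustration and are not part of the repository. The values are transcribed from the commands above, and the key names are assumed to use the underscored form seen in the quick-test command (`env_args.map_name`).

```python
# Hypothetical helper (not part of the repository): rebuild the experiment
# commands from a table of rnn_hidden_dim values transcribed from the list above.

ALGORITHMS = ["iql", "vdn", "qmix", "diql", "ddn", "dmix"]

# rnn_hidden_dim per map, listed in the same order as ALGORITHMS.
RNN_HIDDEN_DIM = {
    "3s5z_vs_3s6z": [512, 128, 128, 256, 512, 256],
    "6h_vs_8z": [128, 128, 256, 512, 512, 256],
    "MMM2": [256, 64, 64, 512, 512, 256],
    "27m_vs_30m": [256, 64, 64, 512, 128, 128],
    "corridor": [256, 128, 256, 512, 128, 64],
}


def build_command(map_name, algorithm):
    """Return the training command for one (map, algorithm) pair."""
    dim = RNN_HIDDEN_DIM[map_name][ALGORITHMS.index(algorithm)]
    return (
        f"python3 src/main.py --config={algorithm} --env-config=sc2 "
        f"with env_args.map_name={map_name} rnn_hidden_dim={dim}"
    )


def all_commands():
    """Return every command, grouped by map in the order of the table."""
    return [
        build_command(map_name, algorithm)
        for map_name in RNN_HIDDEN_DIM
        for algorithm in ALGORITHMS
    ]


if __name__ == "__main__":
    for command in all_commands():
        print(command)
```

Printing the commands rather than executing them keeps the sketch safe to run outside the container; the output can be pasted into the container shell or written to a script.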

If you want to modify the algorithm, you can modify the files in ./pymarl directly, without rebuilding the docker image or restarting the docker container.

Compare Baseline code with DFAC code

The DFAC code is organized with minimal changes on top of oxwhirl/pymarl for readability. You may want to compare the baselines with their DFAC variants using the following commands:

```sh
# Configs
diff pymarl/src/config/algs/iql.yaml pymarl/src/config/algs/diql.yaml
diff pymarl/src/config/algs/vdn.yaml pymarl/src/config/algs/ddn.yaml
diff pymarl/src/config/algs/qmix.yaml pymarl/src/config/algs/dmix.yaml

# Agent
diff pymarl/src/learners/q_learner.py pymarl/src/learners/iqn_learner.py
diff pymarl/src/modules/agents/rnn_agent.py pymarl/src/modules/agents/iqn_rnn_agent.py

# Mixer
diff pymarl/src/modules/mixers/vdn.py pymarl/src/modules/mixers/ddn.py
diff pymarl/src/modules/mixers/qmix.py pymarl/src/modules/mixers/dmix.py
```
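Where the `diff` tool is unavailable, or when the comparison should be scripted, the same baseline-versus-DFAC comparison can be approximated with Python's standard `difflib` module. This is a minimal sketch: `FILE_PAIRS`, `unified_diff_text`, and `diff_files` are illustrative names, not part of the repository, and only the config and mixer pairs from the commands above are listed; the learner and agent pairs can be added in the same way.

```python
import difflib
from pathlib import Path

# Baseline file -> DFAC variant, relative to the repository root.
# Paths are copied from the diff commands above (config and mixer pairs only).
FILE_PAIRS = [
    ("pymarl/src/config/algs/iql.yaml", "pymarl/src/config/algs/diql.yaml"),
    ("pymarl/src/config/algs/vdn.yaml", "pymarl/src/config/algs/ddn.yaml"),
    ("pymarl/src/config/algs/qmix.yaml", "pymarl/src/config/algs/dmix.yaml"),
    ("pymarl/src/modules/mixers/vdn.py", "pymarl/src/modules/mixers/ddn.py"),
    ("pymarl/src/modules/mixers/qmix.py", "pymarl/src/modules/mixers/dmix.py"),
]


def unified_diff_text(old_text, new_text, old_name="baseline", new_name="dfac"):
    """Return a unified diff of two strings, or '' if they are identical."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff)


def diff_files(old_path, new_path):
    """Return a unified diff of two text files."""
    old_text = Path(old_path).read_text()
    new_text = Path(new_path).read_text()
    return unified_diff_text(old_text, new_text, str(old_path), str(new_path))


if __name__ == "__main__":
    for old_path, new_path in FILE_PAIRS:
        # Skip pairs that are missing, e.g. when run outside the repository root.
        if Path(old_path).exists() and Path(new_path).exists():
            print(diff_files(old_path, new_path))
```

Run it from the repository root so that the relative paths resolve.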

To compare all modifications across every package used, refer to this comparison link of all modifications.

Developing new Algorithms

Updating Packages

Since this repository is frozen at old commits for reproducibility, you may want to use the newest packages:

For common baselines, you may want to refer to the following package, which collects a number of baselines:

hijkzzz/pymarl2

Inspect the Training Progress

You can inspect the training progress in real-time by the following command:

```sh
tensorboard --logdir=./results
```
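If TensorBoard shows no runs, it can help to check which directories under the results folder actually contain event files. The sketch below assumes only that TensorBoard event files carry `tfevents` in their file names, which is the usual convention; the exact layout of `./results` is not assumed, and `find_event_dirs` is a hypothetical helper rather than part of the repository.

```python
from pathlib import Path


def find_event_dirs(results_dir="./results"):
    """Return the sorted directories under results_dir that hold event files.

    Assumption: event files have 'tfevents' somewhere in their file name.
    """
    root = Path(results_dir)
    if not root.is_dir():
        return []
    # Collect the parent directory of every matching file, at any depth.
    dirs = {p.parent for p in root.rglob("*tfevents*") if p.is_file()}
    return sorted(dirs)


if __name__ == "__main__":
    for directory in find_event_dirs():
        print(directory)
```

Any directory it prints can be passed to `--logdir` to inspect a single run instead of the whole results folder.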

Citing DFAC

If you used the provided code or want to cite our work, please cite the DFAC paper.

BibTeX format:

@InProceedings{sun21dfac,
  title = {{DFAC} Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning},
  author = {Sun, Wei-Fang and Lee, Cheng-Kuang and Lee, Chun-Yi},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {9945--9954},
  year = {2021},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/sun21c/sun21c.pdf},
  url = {http://proceedings.mlr.press/v139/sun21c.html},
}

You will also want to cite the SMAC paper for providing the benchmark used in the paper.

License

To maintain reproducibility, we froze the following packages at the commits used in the paper. The licenses of these packages are listed below:

oxwhirl/sacred (at commit 13f04ad) is released under the MIT License
oxwhirl/smac (at commit 8d2c42b) is released under the MIT License
oxwhirl/pymarl (at commit dd92936) is released under the Apache-2.0 License

Further changes based on the packages above are released under the Apache-2.0 License.

Owner

Name: Johnson Sun
Login: j3soon
Kind: user
Location: Taiwan
Company: @Elsa-Lab @NVIDIA

Website: https://j3soon.github.io/
Twitter: j3soon
Repositories: 129
Profile: https://github.com/j3soon

Citation (CITATION.bib)

@InProceedings{sun21dfac,
  title = 	 {{DFAC} Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning},
  author =       {Sun, Wei-Fang and Lee, Cheng-Kuang and Lee, Chun-Yi},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {9945--9954},
  year = 	 {2021},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/sun21c/sun21c.pdf},
  url = 	 {http://proceedings.mlr.press/v139/sun21c.html},
}

GitHub Events

Total

Watch event: 3

Last Year

Watch event: 3

Committers

Last synced: 11 months ago

All Time

Total Commits: 8
Total Committers: 1
Avg Commits per committer: 8.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Johnson	j**n@m**t	8

Committer Domains (Top 20 + Academic)

msa.hinet.net: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 2
Total pull requests: 0
Average time to close issues: 9 days
Average time to close pull requests: N/A
Total issue authors: 2
Total pull request authors: 0
Average comments per issue: 1.5
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

ybh4798 (1)
LiP2301 (1)

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Dependencies

pymarl/requirements.txt pypi

Pillow ==5.3.0
PyYAML ==3.13
absl-py ==0.5.0
atomicwrites ==1.2.1
attrs ==18.2.0
certifi ==2018.8.24
chardet ==3.0.4
cycler ==0.10.0
docopt ==0.6.2
enum34 ==1.1.6
future ==0.16.0
idna ==2.7
imageio ==2.4.1
jsonpickle ==0.9.6
kiwisolver ==1.0.1
matplotlib ==3.0.0
mock ==2.0.0
more-itertools ==4.3.0
mpyq ==0.2.5
munch ==2.3.2
numpy ==1.15.2
pathlib2 ==2.3.2
pbr ==4.3.0
pluggy ==0.7.1
portpicker ==1.2.0
probscale ==0.2.3
protobuf ==3.6.1
py ==1.6.0
pygame ==1.9.4
pyparsing ==2.2.2
pysc2 ==3.0.0
pytest ==3.8.2
python-dateutil ==2.7.3
requests ==2.19.1
s2clientprotocol ==4.10.1.75800.0
sacred ==0.7.2
scipy ==1.1.0
six ==1.11.0
sk-video ==1.1.10
snakeviz ==1.0.0
tensorboard-logger ==0.1.0
torch ==0.4.1
torchvision ==0.2.1
tornado ==5.1.1
urllib3 ==1.23
websocket-client ==0.53.0
whichcraft ==0.5.2
wrapt ==1.10.11

sacred/dev-requirements.txt pypi

GitPython ==2.1.1 development
Mako ==1.0.6 development
MarkupSafe ==0.23 development
PyYAML ==3.12 development
SQLAlchemy ==1.1.4 development
docopt ==0.6.2 development
gitdb2 ==2.0.0 development
hashfs ==0.7.0 development
jsonpickle ==0.9.3 development
mock ==2.0.0 development
mongomock ==3.7.0 development
munch ==2.0.4 development
numpy ==1.11.3 development
pandas ==0.19.2 development
pbr ==1.10.0 development
py ==1.4.32 development
pymongo ==3.4.0 development
pytest ==3.0.5 development
python-dateutil ==2.6.0 development
pytz ==2016.10 development
scandir ==1.4 development
sentinels ==1.0.0 development
smmap2 ==2.0.1 development
tinydb ==3.2.1 development
tinydb-serialization ==1.0.3 development
wrapt ==1.10.8 development

sacred/requirements.txt pypi

docopt ==0.6.2
jsonpickle ==0.9.3
mock ==2.0.0
munch ==2.0.4
pbr ==1.10.0
py ==1.4.32
pytest ==3.0.5
wrapt ==1.10.8

sacred/setup.py pypi

docopt >=0.3
jsonpickle >=0.7.2
munch >=2.0.2
wrapt >=1.0

smac/setup.py pypi

absl-py >=0.1.0
numpy >=1.10
pysc2 >=3.0.0
s2clientprotocol >=4.10.1.75800.0

Dockerfile docker

$DOCKER_BASE latest build

pymarl/docker/Dockerfile docker

nvidia/cuda 9.2-cudnn7-devel-ubuntu16.04 build
