Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.6%) to scientific vocabulary
Keywords
Repository
Learning-based MCTS in the Pommerman Environment
Basic Info
Statistics
- Stars: 2
- Watchers: 2
- Forks: 2
- Open Issues: 0
- Releases: 1
Topics
Metadata Files
README.md
PommerLearn: Learning-based MCTS in the Pommerman Environment
This repository provides an implementation of learning-based Monte-Carlo Tree Search variants in the Pommerman environment. Our approaches leverage opponent models (planning agents) to transform the multiplayer game into single- and two-player games depending on the provided settings.
Docker
The simplest way to get started and execute runs is to build a docker image and run it as a container.
Available backends:
- TensorRT (NVIDIA GPU required): Tested with TensorRT 8.0.1 and PyTorch 1.9.0.
Prerequisites
To use NVIDIA GPUs in Docker containers, you have to install Docker and nvidia-docker2. Have a look at the installation guide: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
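As a quick, optional sanity check (not part of the project's own setup, just the usual way to verify the NVIDIA Container Toolkit), you can try running `nvidia-smi` inside a CUDA base container; the image tag below is only an example:

```bash
# Verify that containers can access the GPU; the image tag is an example, use any available nvidia/cuda base image
docker run --rm --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi
```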
Build Scripts
We provide small scripts to facilitate building the image and running experiments.
1. Build the image:
   ```
   $ bash docker/build.sh
   ```
   This automatically caches the dependencies. If you run it again, only the code is rebuilt. If you want to rebuild the whole image, just call `bash docker/build.sh --no-cache`.
2. Specify where you want to store the data generated by the experiments as environment variable `$POMMER_DATA_DIR`. You can `export POMMER_DATA_DIR=/some/dir` or just add `POMMER_DATA_DIR=/some/dir` as a prefix to the command in the following step.
3. Create a container and run the training loop (replace `--help` with the arguments of your choice):
   ```
   $ bash docker/run.sh --help
   ```
   - Note that `--dir` and `--exec` are already specified correctly by `docker/run.sh`.
   - All GPUs are visible in the container and gpu 0 is used by default. You can specify the gpu to be used like `--gpu 4`.
Manual Docker Build
Of course, you can also build and run the image manually. Have a closer look at the scripts from the previous section for details.
Additional notes:
* You can limit the gpu access of a container like `--gpus device=4`. However, PommerLearn has a `--gpu` argument that can be used instead.
* Warning: If you use rootless docker, the container will probably run out of memory. Adding `--ipc=host` or `--shm-size=32g` to the `docker run` command helps. This is also done by default in `docker/run.sh`.
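Putting the notes above together, a minimal manual sketch might look like the following; the Dockerfile path (`docker/Dockerfile`), the image tag (`pommerlearn`), and the container-side mount point (`/data`) are assumptions, so check `docker/build.sh` and `docker/run.sh` for the exact arguments:

```bash
# Hypothetical manual build; the Dockerfile path and image tag are assumptions (see docker/build.sh)
docker build -t pommerlearn -f docker/Dockerfile .

# Hypothetical manual run: expose the GPUs, enlarge shared memory and mount a data directory
# (the /data mount point is an assumption, see docker/run.sh for the actual layout)
docker run --rm --gpus all --ipc=host \
    -v "$POMMER_DATA_DIR":/data \
    pommerlearn --help
```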
Experiments
Search Approaches
1. Generate an SL dataset with 1 million samples with
   ```
   $POMMER_EXEC --mode=ffa_sl --max-games=-1 --chunk-size=1000 --chunk-count=1000 --log --file-prefix=./1M_simple
   ```
   where `$POMMER_EXEC` can be your `PommerLearn` executable or `MODE=exec bash docker/run.sh`.
2. Train the SL model: Run `pommerlearn/training/train_cnn.py` with the following modified arguments (see the bottom of the file) and save the resulting model as `$POMMER_DATA_DIR/model-sl`:
   `"dataset_path": "1M_simple_0.zr", "test_size": 0.01, "output_dir": "./model-sl"`
3. Generate a dummy model by running `pommerlearn/debug/create_dummy_model.py` and save it as `$POMMER_DATA_DIR/model-dummy`.
4. You can now perform search experiments with both models. Use `POMMER_1VS1=false MODE=exec bash run.sh` for the single-player search and `POMMER_1VS1=true MODE=exec bash run.sh` for the two-player search.
5. To reproduce our results, you can generate five SL and dummy models labeled with the respective suffixes `-0` to `-4` (a scripted example for the dummy models follows this list). Navigate into the docker directory and run the search experiments with
   ```
   ./docker $ bash search_experiments.sh
   ```
   The results will be recorded in a single CSV file.
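As referenced above, a possible way to script the dummy-model generation is sketched below; it assumes that `create_dummy_model.py` writes its output to `./model` (as suggested by the run instructions further down) and that the models end up in `$POMMER_DATA_DIR`, so adjust the paths to your setup:

```bash
# Hypothetical helper: create five dummy models with suffixes -0 to -4
# (assumes create_dummy_model.py writes its output to ./model; adjust paths to your setup)
for i in 0 1 2 3 4; do
    python pommerlearn/debug/create_dummy_model.py
    mv ./model "$POMMER_DATA_DIR/model-dummy-$i"
done
```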
Reinforcement Learning
Navigate into the docker directory and run the RL experiments with
```
./docker $ bash rl_experiments.sh
```
This will create a new directory in your working directory to store the training logs.
You will find the results in `$POMMER_DATA_DIR/archive` and the TensorBoard runs in `$POMMER_DATA_DIR/runs`.
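For example, assuming TensorBoard is installed in your Python environment, you can inspect the recorded runs with:

```bash
# Visualize the training curves written by the RL loop
tensorboard --logdir "$POMMER_DATA_DIR/runs"
```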
Team Mode Experiments
To perform experiments in the team mode, you can collect samples with the option `--mode=team_sl` and otherwise proceed like in the FFA mode, e.g.
```
$POMMER_EXEC --mode=team_sl --max-games=-1 --chunk-size=1000 --chunk-count=1000 --log --file-prefix=./1M_simple_team
```
You can then run `pommerlearn/training/train_cnn.py` on the generated data set; this will automatically use the value targets for the team mode due to the meta information in the data set.
Development
Manual Installation of Dependencies
For the Python side:
- `python 3.7` and `pip`. It is recommended to use virtual environments. This guide will use Anaconda. Create an environment named `pommer` with
  ```
  $ conda create -n pommer python=3.7
  ```

For the C++ side:
- Essential build tools: `gcc`, `make`, `cmake`
  ```
  $ sudo apt install build-essential cmake
  ```
- The dependencies z5, xtensor, boost and json by nlohmann can directly be installed with conda in the pommer environment:
  ```
  (pommer) $ conda install -c conda-forge z5py xtensor boost nlohmann_json blosc
  ```
- Blaze needs to be installed manually (a rough example sequence follows this list). Note that it can be unpacked anywhere; it does not have to be `/usr/local`. For further information, you can refer to the installation guide or the Dockerfiles in this repository.
  ```
  cmake -DCMAKE_INSTALL_PREFIX=/usr/local/
  sudo make install
  export BLAZE_PATH=/usr/local/include/
  ```
- Manual installation of TensorRT (not Torch-TensorRT), including CUDA and cuDNN. Please refer to the installation guide by NVIDIA: https://developer.nvidia.com/tensorrt-getting-started
Clone Repository
This repository depends on submodules. Clone it and initialize all submodules with
```
$ git clone git@gitlab.com:jweil/PommerLearn.git && \
    cd PommerLearn && \
    git submodule update --init
```
Build Instructions
- The current version requires you to set the following environment variables (example export commands are shown after this list):
* `CONDA_ENV_PATH`: path of your conda environment (e.g. `~/conda/envs/pommer`)
* `BLAZE_PATH`: blaze installation path (e.g. `/usr/local/include`)
* `CUDA_PATH`: cuda installation path (e.g. `/usr/local/cuda`)
* `TENSORRT_PATH` (when using the CrazyAra TensorRT backend, e.g. `/usr/src/tensorrt`)
* [`Torch_DIR`] (when using the CrazyAra Torch backend, currently untested)
- Build the C++ environment with the provided `CMakeLists.txt`. To use TensorRT >= 8 (recommended), you have to specify `-DUSE_TENSORRT8=ON`.
  ```
  /PommerLearn/build $ cmake -DCMAKE_BUILD_TYPE=Release -DUSE_TENSORRT8=ON -DCMAKE_CXX_COMPILER="$(which g++)" ..
  /PommerLearn/build $ make VERBOSE=1 all -j8
  ```
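For reference, the environment variables from the list above could be exported as follows before invoking CMake; the values are only the examples given above, so adjust them to your installation:

```bash
# Example values taken from the list above; adjust the paths to your installation
export CONDA_ENV_PATH=~/conda/envs/pommer
export BLAZE_PATH=/usr/local/include
export CUDA_PATH=/usr/local/cuda
export TENSORRT_PATH=/usr/src/tensorrt   # only needed for the CrazyAra TensorRT backend
```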
Run Instructions
Optional: You can install PyTorch 1.9.0 with GPU support via
```
conda install -y pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.1 -c conda-forge -c pytorch
```
The remaining Python runtime dependencies can be installed with
```
(pommer) $ pip install -r requirements.txt
```
Before starting the RL loop, you can check whether everything is set up correctly by creating a dummy model and loading it in the C++ executable:
```
(pommer) /PommerLearn/build $ python ../pommerlearn/debug/create_dummy_model.py
(pommer) /PommerLearn/build $ ./PommerLearn --mode=ffa_mcts --model=./model/onnx
```
You can then start training by running
```
(pommer) /PommerLearn/build $ python ../pommerlearn/training/rl_loop.py
```
Troubleshooting
Prerequisites and Building
* Make sure that you've pulled all submodules recursively
* In older versions of TensorRT, you have to manually comment out `using namespace sample;` in `deps/CrazyAra/engine/src/nn/tensorrtapi.cpp`.
* We experienced issues with `std::filesystem` being undefined when using GCC 7.5.0. We recommend updating to a more recent version, e.g. GCC 11.2.0.
Running
* For runtime issues like `libstdc++.so.6: version 'GLIBCXX_3.4.30' not found`, try loading your system libraries with `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/`.
  On some systems, ctypes somehow uses a different libstdc++ from the conda environment instead of the correct lib path.
  As a last resort, you can back up the original library with `mv /conda-lib-path/libstdc++.so.6 /conda-lib-path/libstdc++.so.6.old` and then create a symbolic link with `ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /conda-lib-path/libstdc++.so.6`.
* If you encounter errors like `ModuleNotFoundError: No module named 'training'`, set your `PYTHONPATH` to the `pommerlearn` directory. For example, `export PYTHONPATH=/PommerLearn/pommerlearn`.
* When loading TensorBoard runs, you can get errors like `Error: tonic::transport::Error(Transport, hyper::Error(Accept, Os { code: 24, kind: Other, message: "Too many open files" }))`. The argument `--load_fast=false` might help.
Performance Profiling
You can install the plotting utility for gprof: https://github.com/jrfonseca/gprof2dot
Activate the CMake option `USE_PROFILING` in `CMakeLists.txt` and rebuild.
Run the executable and generate the plot:
```bash
./PommerLearn --mode ffa_mcts --max_games 10
gprof PommerLearn | gprof2dot | dot -Tpng -o profile.png
```
Publications
If you find this repository helpful, please consider citing our paper:
```
@inproceedings{weil2023knowYourEnemy,
  author={Weil, Jannis and Czech, Johannes and Meuser, Tobias and Kersting, Kristian},
  title={{Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent Models in Pommerman}},
  booktitle={Proceedings of the Adaptive and Learning Agents Workshop (ALA) at AAMAS 2023},
  url={https://alaworkshop2023.github.io/},
  year={2023}
}
```
Owner
- Name: Jannis Weil
- Login: jw3il
- Kind: user
- Location: Darmstadt, Germany
- Company: TU Darmstadt
- Repositories: 2
- Profile: https://github.com/jw3il
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
PommerLearn: Learning-based MCTS in the Pommerman
Environment
message: >-
If you use this repository for your research, please cite
the following paper.
type: software
authors:
- given-names: Jannis
family-names: Weil
- given-names: Johannes
family-names: Czech
- given-names: Jonas
family-names: Ringsdorf
license: GPL-3.0
preferred-citation:
type: conference-paper
title: >-
Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent
Models in Pommerman
authors:
- family-names: Weil
given-names: Jannis
- family-names: Czech
given-names: Johannes
- family-names: Meuser
given-names: Tobias
- family-names: Kersting
given-names: Kristian
collection-title: >-
Proceedings of the Adaptive and Learning Agents Workshop (ALA) at AAMAS
2023
year: 2023