benchmarking-lunar-lander

Submission for CM30225 (Reinforcement Learning) - Benchmarking RL Methods in Lunar Lander

https://github.com/olliejonas/benchmarking-lunar-lander

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.6%) to scientific vocabulary
Last synced: 7 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: OllieJonas
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 103 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed about 3 years ago
Metadata Files
Readme License Citation

README.md


Benchmarking Different RL Methods on Lunar Lander

Submission for CM30225 (Reinforcement Learning) at the University of Bath, written by Fraser Dwyer, Helen Harmer, Ollie Jonas, and Yatin Tanna.

This project provides a framework for benchmarking multiple RL methods on Lunar Lander, along with utilities common to all of them.

A more detailed description of the project (including the config file, how to create an agent, and the project's structure) can be found in the docs/ directory.
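To give a feel for the per-agent configuration described in docs/, here is a rough sketch of what such a file might look like; the section names and fields below are illustrative assumptions, not the project's actual schema:

```yaml
# Hypothetical config sketch: one shared section plus one dynamically
# loaded section per agent (all names and values are assumptions).
overall:
  episodes: 1000
  save_recordings_at: [0, 250, 500, 999]
  checkpoint_every: 100

dqn:                    # loaded only when the DQN agent is selected
  learning_rate: 0.001
  gamma: 0.99
  buffer_size: 50000

sarsa:
  learning_rate: 0.01
  epsilon: 0.1
```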

Features

Brief outline of the features provided:

  • Automatically provides runner code with a replay buffer (with conversion to PyTorch tensors)
  • Output (raw data, charts and logs) of an overall summary of the run (cumulative reward, average reward, number of timesteps) per episode
  • Output (raw data & charts) of individual rewards for each timestep at specified episodes
  • Output recordings of specified episodes
  • Saving of checkpoints for neural networks at specified intervals
  • Loading of neural network parameters at startup (from an absolute path, a relative path, or the latest run)
  • Switches between LunarLander's continuous and discrete action spaces at runtime
  • Provides an easy-to-read configuration file, which dynamically loads a section for each agent, allowing different hyper-parameters to be specified
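The replay-buffer feature above can be sketched in a few lines. This is a minimal, stdlib-only illustration, not the project's actual implementation; the real buffer converts sampled batches to PyTorch tensors, which is elided here:

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO buffer of (state, action, reward, next_state, done)."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop off automatically
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement; a torch-based buffer would
        # stack each field into a tensor here instead of returning lists.
        batch = self.rng.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return list(states), list(actions), list(rewards), list(next_states), list(dones)

    def __len__(self):
        return len(self.buffer)
```

The `deque(maxlen=...)` makes eviction automatic: once capacity is reached, pushing a new transition silently discards the oldest one.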

Limitations

A brief outline of things this program can't do, or can only do with a fight (and that we wish it could do):

  • Log stdout / stderr to an output file (it only logs what we log, not what gym logs)
  • Multiple runner implementations (we use a different one for SARSA, but it's very ugly code)
  • Save episodes based on some criterion discovered at run-time (for example: some DQN runs took tens of thousands of timesteps to complete an episode, but there is no way to request recordings of exactly those episodes - you have to specify which episodes to save before the run starts)
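The third limitation could in principle be addressed by buffering each episode's frames and only persisting them when a predicate fires at episode end. The sketch below is hypothetical, not project code; the `should_save` predicate and `save_video` callback are assumed interfaces:

```python
class ConditionalRecorder:
    """Buffer an episode's frames; persist them only if a predicate holds."""

    def __init__(self, should_save, save_video):
        self.should_save = should_save  # e.g. lambda stats: stats["timesteps"] > 10_000
        self.save_video = save_video    # callback that writes the frames to disk
        self.frames = []
        self.timesteps = 0

    def on_step(self, frame):
        self.frames.append(frame)
        self.timesteps += 1

    def on_episode_end(self, episode_idx):
        stats = {"episode": episode_idx, "timesteps": self.timesteps}
        saved = False
        if self.should_save(stats):
            self.save_video(episode_idx, self.frames)
            saved = True
        self.frames, self.timesteps = [], 0  # reset for the next episode
        return saved
```

The cost is holding every frame of the current episode in memory until the decision point, which is exactly the trade-off that makes this awkward to retrofit into a runner that records eagerly.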

Installation Guide

Console (Linux / Mac)

For Linux / Mac, it's very easy to do:

  1. Navigate to the root directory for this project
  2. Run pip3 install -r requirements.txt
  3. Run pip3 install swig
  4. Run pip3 install "gym[all]" or pip3 install "gym[box2d]" (quote the extras so shells like zsh don't treat the square brackets as a glob pattern)
  5. Set your PYTHONPATH environment variable to the rlcw directory

Windows

For Windows, you can run this program using Docker.

Installation Guide (Windows)

  1. Install Docker. You can find the link for this here: Install Docker
  2. For Windows, you'll need the WSL 2 Linux kernel (a Linux kernel for Windows) and the Ubuntu distro for WSL; this guide might be helpful: Install WSL 2. Also note that Docker Desktop will automatically start when you start your PC. If you want to disable this, do the following:
    1. Open Task Manager
    2. Go to the Startup Tab
    3. Find Docker Desktop, right click and click Disable.

Running the Program

For UNIX-based systems, you just run the program like any other Python program: python3 -m main.

For Windows, a run.bat file is included in the root directory for convenience. It builds and runs the Docker image, then collects any results from the container.

Owner

  • Name: Ollie
  • Login: OllieJonas
  • Kind: user
  • Location: London, United Kingdom

University of Bath Computer Science Student

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Jonas"
  given-names: "Ollie"
- family-names: "Dwyer"
  given-names: "Fraser"
- family-names: "Harmer"
  given-names: "Helen"
- family-names: "Tanna"
  given-names: "Yatin"
title: "Benchmarking Different RL Methods on Lunar Lander"
version: 1.0
date-released: "2023-01-09"
url: "https://github.com/OllieJonas/ReinforcementLearningCW"


Dependencies

Dockerfile docker
  • python 3.9 build
requirements.txt pypi
  • PyYAML *
  • Pympler *
  • gym *
  • ipython ==8.7.0
  • matplotlib *
  • moviepy *
  • numpy ==1.23.5
  • pyprof2calltree *
  • scipy *
  • swig *
  • torch *
  • tqdm *