benchmarking-lunar-lander
Submission for CM30225 (Reinforcement Learning) - Benchmarking RL Methods in Lunar Lander
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file (found)
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.6%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Benchmarking Different RL Methods on Lunar Lander
Submission for CM30225 (Reinforcement Learning) at the University of Bath, written by Fraser Dwyer, Helen Harmer, Ollie Jonas, and Yatin Tanna.
This project aims to provide a framework for benchmarking multiple RL methods, along with utilities that are common to all of them.
A more detailed description of the project (including the config file, creating an agent and the project's structure) can be found in the docs/ directory.
Features
Brief outline of the features provided:
- Automatically provides runner code with a replay buffer (with conversion to PyTorch tensors)
- Output (raw data, charts, and logs) of an overall project summary (cumulative reward, average reward, number of timesteps) per episode
- Output (raw data & charts) of individual rewards for each timestep at specified episodes
- Output recordings of specified episodes
- Saving of checkpoints for neural networks at specified intervals
- Loading of neural network parameters at startup (from an absolute path, a relative path, or the latest run)
- Swaps between the continuous and discrete action spaces of LunarLander at runtime
- Provides an easy-to-read configuration file that dynamically loads a section for each agent, allowing different hyper-parameters to be specified
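The replay buffer with PyTorch tensor conversion mentioned above could look roughly like the following sketch. This is a minimal illustration under assumed names, not the project's actual implementation:

```python
import random
from collections import deque

import numpy as np
import torch


class ReplayBuffer:
    """A minimal ring buffer of transitions with batched PyTorch sampling."""

    def __init__(self, capacity: int):
        # deque with maxlen silently evicts the oldest transitions when full
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Sample without replacement, then stack each field into one tensor.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return (
            torch.as_tensor(np.array(states), dtype=torch.float32),
            torch.as_tensor(actions, dtype=torch.int64),
            torch.as_tensor(rewards, dtype=torch.float32),
            torch.as_tensor(np.array(next_states), dtype=torch.float32),
            torch.as_tensor(dones, dtype=torch.float32),
        )

    def __len__(self):
        return len(self.buffer)
```

Sampling returns stacked tensors that can be fed directly into a PyTorch update step, which is what lets the runner code share one buffer across agents.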
Limitations
A brief outline of things this program can't do, or things you really have to fight it to achieve (that we wish it could do):
- Log stdout / stderr to an output file (it only logs what we log, not what gym logs)
- Multiple runner implementations (we use a different one for SARSA, but it's very ugly code)
- Save episodes based on criteria discovered at run-time (for example, some DQN runs took tens of thousands of timesteps to complete, but there is no way to request recordings of those episodes; which episodes to record must be specified before the run starts)
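The first limitation could, in principle, be worked around with Python-level stream redirection. A hedged sketch: this tees Python writes to sys.stdout/sys.stderr into a log file, but it does not capture output that C extensions (such as Box2D) write directly to the underlying file descriptors, which may be why gym's output is missed:

```python
import contextlib
import io
import sys


class Tee(io.TextIOBase):
    """Write everything to both the original stream and a log file."""

    def __init__(self, stream, log_file):
        self.stream = stream
        self.log_file = log_file

    def write(self, text):
        self.stream.write(text)
        self.log_file.write(text)
        return len(text)

    def flush(self):
        self.stream.flush()
        self.log_file.flush()


# Everything printed inside this block goes to the console AND to run.log.
with open("run.log", "w") as log, \
        contextlib.redirect_stdout(Tee(sys.stdout, log)), \
        contextlib.redirect_stderr(Tee(sys.stderr, log)):
    print("this line goes to the console and to run.log")
```

Capturing file-descriptor-level output as well would need an os.dup2-style redirect rather than contextlib.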
Installation Guide
Console (Linux / Mac)
For Linux / Mac, it's very easy to do:
- Navigate to the root directory for this project
- Run pip3 install -r requirements.txt
- Run pip3 install swig
- Run pip3 install "gym[All]" or pip3 install "gym[Box2D]" (quote the brackets so your shell doesn't try to expand them)
- Set your PYTHONPATH environment variable to the rlcw directory
Windows
For Windows, you can run this program using Docker.
Installation Guide (Windows)
- Install Docker. The download link can be found on Docker's website.
- For Windows, you will need the WSL 2 Linux kernel (a Linux kernel for Windows), with the Ubuntu distro installed for WSL. The "Install WSL 2" guide may be helpful. Also note that Docker Desktop will start automatically when you start your PC. If you want to disable this, do the following:
- Open Task Manager
- Go to the Startup Tab
- Find Docker Desktop, right click and click Disable.
Running the Program
For UNIX-based systems, you just need to run it like any other Python program: python3 -m main.
For Windows, a run.bat file has been included in the root directory for convenience's sake. This builds and runs the image, then collects any results from the container.
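Whichever entry point is used, hyper-parameters come from the configuration file described under Features. A minimal sketch of how a per-agent section might be loaded dynamically with PyYAML; the section names and keys here are hypothetical, not the project's actual schema:

```python
import yaml

# Hypothetical config contents; in the project this would come from a file.
CONFIG_TEXT = """
agent: dqn

dqn:
  learning_rate: 0.001
  gamma: 0.99
  buffer_size: 50000

sarsa:
  learning_rate: 0.1
  gamma: 0.95
"""

config = yaml.safe_load(CONFIG_TEXT)
agent_name = config["agent"]
# Dynamically pick up only the section for the selected agent, so each
# agent can declare its own hyper-parameters without code changes.
hyper_params = config[agent_name]
```

The point of the dynamic lookup is that adding a new agent only requires adding a new top-level section, not editing the loader.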
Owner
- Name: Ollie
- Login: OllieJonas
- Kind: user
- Location: London, United Kingdom
- Repositories: 2
- Profile: https://github.com/OllieJonas
University of Bath Computer Science Student
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Jonas"
    given-names: "Ollie"
  - family-names: "Dwyer"
    given-names: "Fraser"
  - family-names: "Harmer"
    given-names: "Helen"
  - family-names: "Tanna"
    given-names: "Yatin"
title: "Benchmarking Different RL Methods on Lunar Lander"
version: 1.0
date-released: 2023-01-09
url: "https://github.com/OllieJonas/ReinforcementLearningCW"
Dependencies
- python 3.9 build
- PyYAML *
- Pympler *
- gym *
- ipython ==8.7.0
- matplotlib *
- moviepy *
- numpy ==1.23.5
- pyprof2calltree *
- scipy *
- swig *
- torch *
- tqdm *