benchmarking-lunar-lander
Submission for CM30225 (Reinforcement Learning) - Benchmarking RL Methods in Lunar Lander
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file (found)
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.6%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Benchmarking Different RL Methods on Lunar Lander
Submission for CM30225 (Reinforcement Learning) at the University of Bath, written by Fraser Dwyer, Helen Harmer, Ollie Jonas, and Yatin Tanna.
This project aims to provide a framework for benchmarking multiple RL methods, along with utilities that are common to all of them.
A more detailed description of the project (including the config file, creating an agent and the project's structure) can be found in the docs/ directory.
Features
Brief outline of the features provided:
- Automatically provides runner code with a replay buffer (with conversion to PyTorch tensors)
- Output (raw data, charts, and logs) of an overall project summary (cumulative reward, average reward, number of timesteps) per episode
- Output (raw data & charts) of individual rewards for each timestep at specified episodes
- Output recordings of specified episodes
- Saving of checkpoints for neural networks at specified intervals
- Loading of neural network parameters at startup (from an absolute path, a relative path, or the latest run)
- Swaps between the continuous and discrete action spaces of LunarLander at runtime
- Provides an easy-to-read configuration file that dynamically loads a section for each agent, allowing different hyper-parameters to be specified
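The replay buffer with PyTorch tensor conversion mentioned above could look roughly like the following sketch. This is a minimal illustration under assumed names, not the project's actual implementation:

```python
import random
from collections import deque

import numpy as np
import torch


class ReplayBuffer:
    """A minimal ring buffer of transitions with batched PyTorch sampling."""

    def __init__(self, capacity: int):
        # deque with maxlen silently evicts the oldest transitions when full
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Sample without replacement, then stack each field into one tensor.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return (
            torch.as_tensor(np.array(states), dtype=torch.float32),
            torch.as_tensor(actions, dtype=torch.int64),
            torch.as_tensor(rewards, dtype=torch.float32),
            torch.as_tensor(np.array(next_states), dtype=torch.float32),
            torch.as_tensor(dones, dtype=torch.float32),
        )

    def __len__(self):
        return len(self.buffer)
```

Sampling returns stacked tensors that can be fed directly into a PyTorch update step, which is what lets the runner code share one buffer across agents.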
Limitations
A brief outline of things this program can't do, or things you really have to fight it to achieve (that we wish it could do):
- Log stdout / stderr to an output file (it only logs what we log, not what gym logs)
- Multiple runner implementations (we use a different one for SARSA, but it's very ugly code)
- Save episodes based on criteria discovered at run-time (for example, some DQN runs took tens of thousands of timesteps to complete, but there is no way to request recordings of those episodes; which episodes to record must be specified before the run starts)
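The first limitation could, in principle, be worked around with Python-level stream redirection. A hedged sketch: this tees Python writes to sys.stdout/sys.stderr into a log file, but it does not capture output that C extensions (such as Box2D) write directly to the underlying file descriptors, which may be why gym's output is missed:

```python
import contextlib
import io
import sys


class Tee(io.TextIOBase):
    """Write everything to both the original stream and a log file."""

    def __init__(self, stream, log_file):
        self.stream = stream
        self.log_file = log_file

    def write(self, text):
        self.stream.write(text)
        self.log_file.write(text)
        return len(text)

    def flush(self):
        self.stream.flush()
        self.log_file.flush()


# Everything printed inside this block goes to the console AND to run.log.
with open("run.log", "w") as log, \
        contextlib.redirect_stdout(Tee(sys.stdout, log)), \
        contextlib.redirect_stderr(Tee(sys.stderr, log)):
    print("this line goes to the console and to run.log")
```

Capturing file-descriptor-level output as well would need an os.dup2-style redirect rather than contextlib.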
Installation Guide
Console (Linux / Mac)
For Linux / Mac, it's very easy to do:
- Navigate to the root directory for this project
- Run pip3 install -r requirements.txt
- Run pip3 install swig
- Run pip3 install "gym[All]" or pip3 install "gym[Box2D]" (quote the brackets so your shell doesn't try to expand them)
- Set your PYTHONPATH environment variable to the rlcw directory
Windows
For Windows, you can run this program using Docker.
Installation Guide (Windows)
- Install Docker. The download link can be found on Docker's website.
- For Windows, you will need the WSL 2 Linux kernel (a Linux kernel for Windows), with the Ubuntu distro installed for WSL. The "Install WSL 2" guide may be helpful. Also note that Docker Desktop will start automatically when you start your PC. If you want to disable this, do the following:
- Open Task Manager
- Go to the Startup Tab
- Find Docker Desktop, right click and click Disable.
Running the Program
For UNIX-based systems, you just need to run it like any other Python program: python3 -m main.
For Windows, a run.bat file has been included in the root directory for convenience's sake. This builds and runs the image, then collects any results from the container.
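Whichever entry point is used, hyper-parameters come from the configuration file described under Features. A minimal sketch of how a per-agent section might be loaded dynamically with PyYAML; the section names and keys here are hypothetical, not the project's actual schema:

```python
import yaml

# Hypothetical config contents; in the project this would come from a file.
CONFIG_TEXT = """
agent: dqn

dqn:
  learning_rate: 0.001
  gamma: 0.99
  buffer_size: 50000

sarsa:
  learning_rate: 0.1
  gamma: 0.95
"""

config = yaml.safe_load(CONFIG_TEXT)
agent_name = config["agent"]
# Dynamically pick up only the section for the selected agent, so each
# agent can declare its own hyper-parameters without code changes.
hyper_params = config[agent_name]
```

The point of the dynamic lookup is that adding a new agent only requires adding a new top-level section, not editing the loader.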
Owner
- Name: Ollie
- Login: OllieJonas
- Kind: user
- Location: London, United Kingdom
- Repositories: 2
- Profile: https://github.com/OllieJonas
University of Bath Computer Science Student
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Jonas"
    given-names: "Ollie"
  - family-names: "Dwyer"
    given-names: "Fraser"
  - family-names: "Harmer"
    given-names: "Helen"
  - family-names: "Tanna"
    given-names: "Yatin"
title: "Benchmarking Different RL Methods on Lunar Lander"
version: 1.0
date-released: 2023-01-09
url: "https://github.com/OllieJonas/ReinforcementLearningCW"
Dependencies
- python 3.9 build
- PyYAML *
- Pympler *
- gym *
- ipython ==8.7.0
- matplotlib *
- moviepy *
- numpy ==1.23.5
- pyprof2calltree *
- scipy *
- swig *
- torch *
- tqdm *