tunnrl_tbm_maintenance

Working repository for the code of the TunnRL TBM project

https://github.com/tunnrl/tunnrl_tbm_maintenance

Basic Info

Host: GitHub
Owner: TunnRL
License: mit
Language: Python
Default Branch: main
Size: 1 MB

Statistics

Stars: 1
Watchers: 0
Forks: 0
Open Issues: 5
Releases: 2

Created over 4 years ago · Last pushed almost 2 years ago

Tunnel automation with Reinforcement Learning for Cutter Changing - TunnRL-CC

This repository contains code for the ongoing project to use reinforcement learning for optimization of cutter maintenance in hardrock tunnel boring machine excavation.

A first paper on this was published by Erharter and Hansen (2022), accompanied by release v1.0.0 of the repository. The framework was then developed further to its current state, and release v2.0.0 accompanies the submission of the paper "Towards reinforcement learning - driven TBM cutter changing policies" to the journal Automation in Construction. The code was developed by Georg H. Erharter and Tom F. Hansen; Prof. Thomas Marcher (Graz University of Technology) is a further contributing author.

Erharter GH, Hansen TF. Towards optimized TBM cutter changing policies with reinforcement learning. Geomechanics and Tunnelling 2022;15(5):665–70. DOI: 10.1002/geot.202200032

Two different repo versions

The repository is implemented in two different versions:

main branch
industry_advanced branch

The industry_advanced branch has the same RL functionality and plots as the main branch but adds functionality and structure for reporting, reproducibility, configuration and quality control, mainly using:

Mlflow - for tracking and visualization of all parameters and metrics of all training experiments
Hydra - defines, structures and saves all configuration parameters for all experiments
Pydantic - defines schemas and validations for quality checking config-inputs
Rich - enhanced visualisation of terminal output
Pytest - unit testing of code
Docker - to run experiments in a reproducible way on a High Performance Computer (HPC)

The industry_advanced branch uses more advanced programming techniques and follows software principles such as testing, input checks and code formatting, while still making it easy to run the code from the terminal.
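The idea of checking config inputs before a run starts can be sketched with the standard library alone. The block below is a stand-in written with a dataclass, not Pydantic and not the schema used in the industry_advanced branch; the class name ExperimentConfig, its fields and the helper try_config are hypothetical. Only the list of agent architectures is taken from this README.

```python
from dataclasses import dataclass

# Agent architectures named in this README.
SUPPORTED_AGENTS = {"PPO", "A2C", "DDPG", "TD3", "SAC"}


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical config schema; field names are illustrative only."""

    agent: str
    episodes: int
    max_strokes: int

    def __post_init__(self) -> None:
        # Reject bad inputs before any expensive training starts.
        if self.agent not in SUPPORTED_AGENTS:
            raise ValueError(f"unknown agent {self.agent!r}")
        if self.episodes <= 0 or self.max_strokes <= 0:
            raise ValueError("episodes and max_strokes must be positive")


def try_config(**kwargs) -> str:
    """Return 'ok' or the validation error message for the given inputs."""
    try:
        ExperimentConfig(**kwargs)
    except ValueError as err:
        return str(err)
    return "ok"
```

Calling try_config(agent="PPO", episodes=10, max_strokes=5) returns "ok", while an unsupported agent name or a non-positive count returns the error text instead of starting a run.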

Switch between the two versions by choosing the branch at the top left of the GitHub page or by following: https://github.com/TunnRL/TunnRL_TBM_maintenance/tree/industry_advanced.

Directory structure

The code framework depends on a certain folder structure. The main functionality is in the src folder, which contains two main types of files:

<A-D>_description - scripts to be run, e.g. A_main.py
XX_description - functionality used by the scripts

The folders should be set up in the following way:

TunnRL_TBM_maintenance
├── checkpoints - files from training models
├── graphics - saved graphics from running scripts in src
├── install - shell scripts to set up environment and Python version with Pyenv and Poetry
├── optimization - files from optimization of hyperparameters
├── results - study-db files and parameters
│   ├── algorithm_parameters - optimized hyperparameters for agents
├── src
│   ├── A_main.py - main script to call for optimization, training, execution
│   ├── B_optimization_analyzer.py - analysing the optuna optimization study
│   ├── C_training_path_analyzer.py
│   ├── D_recommender.py - recommend the next action from a policy (based on a trained agent)
│   ├── XX_experiment_factory.py
│   ├── XX_hyperparams.py
│   ├── XX_plotting.py
│   ├── XX_TBM_environment.py - defining the RL environment and reward function
├── .gitignore
├── makefile - convenience functionality for file logistics
├── poetry.lock - exact version of all dependencies
├── pyproject.toml - rules for dependencies and div. settings
├── environment.yaml - dependency file to use with conda
├── README.md
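Since the scripts rely on these folders being present, a small helper can create any that are missing. This is a minimal sketch, not part of the repository: the function name ensure_output_dirs is hypothetical, and it assumes that only the output folders (not src or install) may need to be created.

```python
from pathlib import Path

# Output folders named in the tree above; src and install hold code and
# are expected to come from the repository itself.
OUTPUT_DIRS = (
    "checkpoints",
    "graphics",
    "optimization",
    "results/algorithm_parameters",
)


def ensure_output_dirs(root: Path) -> list[Path]:
    """Create any missing output folders below root and return those created."""
    created = []
    for name in OUTPUT_DIRS:
        path = root / name
        if not path.is_dir():
            # parents=True also creates "results" for the nested folder.
            path.mkdir(parents=True)
            created.append(path)
    return created
```

Running ensure_output_dirs(Path(".")) from the project root is safe to repeat: a second call finds every folder in place and returns an empty list.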

To clone the repository and have everything set up for you, run:

    git clone https://github.com/TunnRL/TunnRL_TBM_maintenance.git

Requirements

We provide two ways of setting up the environment and installing all required packages with the same package versions that were used in development. This makes it possible to repeat the experiments as closely as possible.

The recommended way is to use Poetry to set up the environment and install all dependencies. Poetry is stricter about dependencies than conda and defines all dependencies in a human-readable way through the categorized pyproject.toml file. The poetry.lock file defines the exact version of every dependency.

Make sure you have pyenv installed to control your Python version, then install the Python version and continue. If you don't have Poetry and pyenv installed, we provide bash scripts that install them on a Linux system. NOTE: If you are not on Linux, you can run Linux from Windows by activating Windows Subsystem for Linux: https://learn.microsoft.com/en-us/windows/wsl/install

Run these scripts (located in the install folder) in your terminal to install pyenv and Poetry:

    bash install_pyenv.sh
    bash install_poetry.sh

Install the Python version used in this environment. This will take some time:

    pyenv install -v 3.10.5

cd into your project dir and activate the Python version:

    pyenv local 3.10.5

Check your Python version:

    python -V

Set up the environment and install all dependencies:

    poetry install

Running this will install all dependencies defined in poetry.lock.

Activate the environment with:

    poetry shell

Then you are ready to run your Python scripts in the exact same system setup as it has been developed!
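The pyproject.toml of this project constrains Python to >=3.10.5,<3.12. As an optional extra check from inside Python, the sketch below tests an interpreter version against those bounds. The function name is_supported is an assumption for illustration and is not part of the repository.

```python
import sys

# Bounds taken from the pyproject.toml constraint: python >=3.10.5,<3.12
MIN_VERSION = (3, 10, 5)
MAX_EXCLUSIVE = (3, 12)


def is_supported(version: tuple[int, ...]) -> bool:
    """True if version satisfies >=3.10.5,<3.12 (tuple comparison)."""
    return MIN_VERSION <= version[:3] and version[:2] < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    status = "supported" if is_supported(current) else "outside the tested range"
    print(f"Python {'.'.join(map(str, current))} is {status}")
```

With the pyenv setup described above, the interpreter is 3.10.5, which is the lower bound and therefore passes.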

Another way is to use conda.

Create an environment called rl_cutter using environment.yaml with the help of conda. If you get pip errors, install pip libraries manually, e.g. pip install pandas

    conda env create --file environment.yaml

Activate the new environment with:

    conda activate rl_cutter

SQLite for parallel Optuna optimization of parameters

To use the hyperparameter optimization functionality you need to have the database engine SQLite installed. It is installed by default on Linux, but not on Windows.

SQLite makes it possible to have one common study file for an optimization that several terminal sessions (utilizing all the cores of a computer) or several computers can access at the same time. This allows hyperparameters to be optimized in parallel, which greatly speeds up a process that is computationally demanding in reinforcement learning.

Simply kick off a number of similar runs with the same study-name and all processes will update the same study-db.
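The principle of several processes sharing one study file can be illustrated with Python's built-in sqlite3 module. This is a sketch of the mechanism only: the table layout and the function names record_trial and best_trial are invented for illustration and do not match the schema Optuna creates. In Optuna itself, the corresponding setup is to pass a sqlite:/// storage URL together with the same study_name and load_if_exists=True to optuna.create_study in every process.

```python
import sqlite3
from pathlib import Path


def record_trial(db_path: Path, worker: str, reward: float) -> None:
    """Append one finished trial to the shared study file."""
    # Each worker opens its own connection; timeout makes a writer wait
    # (up to 30 s) if another process currently holds the write lock.
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS trials (worker TEXT, reward REAL)"
            )
            conn.execute("INSERT INTO trials VALUES (?, ?)", (worker, reward))
    finally:
        conn.close()


def best_trial(db_path: Path) -> tuple[str, float]:
    """Return (worker, reward) of the highest-reward trial seen by any worker."""
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        row = conn.execute(
            "SELECT worker, reward FROM trials ORDER BY reward DESC LIMIT 1"
        ).fetchone()
    finally:
        conn.close()
    return row[0], row[1]
```

Because every call opens and closes its own connection to the same file, it does not matter whether the calls come from one terminal session or from several; all of them see the same set of trials.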

Principles for training an RL-agent

We use the quality-controlled implementations of RL agents in Stable Baselines3 (implemented in PyTorch). To set up the customized RL environment we follow the OpenAI Gym API by inheriting our custom environment from gym.Env.

The basic principles for training an agent follow these steps (functionality included in scripts):

1. Instantiate the environment: env = CustomEnv(...). This lays out the state of the cutters, defined as a state vector of cutter life for all cutters, each initially assigned the maximum life of 1. Another vector defines the penetration value for all steps in the episode.
2. Instantiate the agent with: agent = PPO(...) (a number of different agents are defined).
3. Instantiate callbacks with: callback = CallbackList([eval_cb, custom_callback]). Callbacks are not part of the actual training process but provide functionality for logging metrics, early stopping etc.
4. Train the agent by looping over episodes, thereby taking new actions and changing the state, each time with a new version of the environment. This functionality is wrapped into the learn function in Stable Baselines3: agent.learn(total_timesteps=self.EPISODES * self.MAX_STROKES, callback=callback).
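The interaction loop behind these steps can be sketched without any dependencies. The class below only mimics the reset/step interface of a Gym environment (step returning observation, reward, done, info); it does not inherit from gym.Env and is not the environment in XX_TBM_environment.py. The wear model, the reward and the binary replace/keep action are assumptions made for illustration, and a hand-written rule stands in for the trained agent, so no learning takes place here.

```python
import random


class ToyCutterEnv:
    """Hypothetical stand-in for the custom environment (not the authors' code)."""

    def __init__(self, n_cutters: int = 4, max_strokes: int = 10, seed: int = 0):
        self.n_cutters = n_cutters
        self.max_strokes = max_strokes
        self.rng = random.Random(seed)

    def reset(self) -> list[float]:
        # Every cutter starts with the maximum life of 1.
        self.life = [1.0] * self.n_cutters
        self.stroke = 0
        return list(self.life)

    def step(self, action: list[int]):
        # action[i] == 1 replaces cutter i before the stroke (assumed semantics).
        for i, replace in enumerate(action):
            if replace:
                self.life[i] = 1.0
        # Assumed wear model: each stroke removes a random fraction of life.
        self.life = [max(0.0, c - self.rng.uniform(0.0, 0.3)) for c in self.life]
        self.stroke += 1
        # Placeholder reward: share of working cutters minus a replacement cost.
        working = sum(c > 0 for c in self.life) / self.n_cutters
        reward = working - 0.1 * sum(action)
        done = self.stroke >= self.max_strokes
        return list(self.life), reward, done, {}


def run_episode(env: ToyCutterEnv, policy) -> float:
    """Play one episode and return the summed reward."""
    state, done, total = env.reset(), False, 0.0
    while not done:
        state, reward, done, _ = env.step(policy(state))
        total += reward
    return total


def replace_worn(state: list[float]) -> list[int]:
    """Trivial hand-written policy: replace cutters below 20 % life."""
    return [int(life < 0.2) for life in state]
```

A call such as run_episode(ToyCutterEnv(seed=1), replace_worn) plays one episode of ten strokes. In the real framework, the learn function of Stable Baselines3 runs this kind of loop internally and updates the agent between rollouts.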

In every episode (of, say, 10 000 episodes), the agent takes (loops over) a number of steps (TBM strokes in this environment, e.g. 1000 strokes of 1.8 m each). In each step an MLP neural network is trained to match the best actions to the given state, i.e. the actions that maximize the reward. The MLPs are used in different ways by the different agent architectures: PPO, A2C, DDPG, TD3, SAC. This training session is a classic neural network training session that loops over a number of epochs (e.g. 10 epochs) in order to minimize the loss function.
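To make the scale of such a run concrete, the example numbers from this paragraph can be plugged into the total_timesteps expression used with agent.learn. The values below are the illustrative ones from the text, not the settings of a particular experiment.

```python
EPISODES = 10_000        # example value from the text
MAX_STROKES = 1_000      # strokes (environment steps) per episode
STROKE_LENGTH_M = 1.8    # metres advanced per stroke

# Value passed to agent.learn(total_timesteps=...)
total_timesteps = EPISODES * MAX_STROKES

# Simulated tunnel length covered by one episode
episode_length_m = MAX_STROKES * STROKE_LENGTH_M

print(total_timesteps)    # 10000000
print(episode_length_m)   # 1800.0
```

With these numbers a single training run covers ten million environment steps, and each episode corresponds to 1.8 km of simulated excavation, which is why running the optimization in parallel matters.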

How to use the functionality - in general terms

1. In A_main.py choose an agent architecture (PPO, DDPG, TD3 etc.) and run an optimization process with Optuna to find the hyperparameters that achieve the highest reward for that architecture.
- Optimization data is saved in the optimization directory, in a subdirectory for each model run. Data in this subdirectory is updated at every chosen episode interval (e.g. every 100 episodes in a 10 000 episode study).
- Each time a model run is completed, common data files for all experiments are saved into the results directory. Run B_optimization_analyzer.py to visualize this data.
2. Train an agent for a number of episodes with a certain architecture and the parameters from an Optuna optimization for that architecture.
- Metrics are saved into the checkpoints directory.
- Visualize the training process with C_training_path_analyzer.py.
3. Execute the actions of a trained agent.
- To recommend the actions (cutter maintenance) for the next step (stroke), use the policy from a trained agent and run D_recommender.py.
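The last step, turning a policy output into a maintenance recommendation, might look roughly like the sketch below. It is not the logic of D_recommender.py: it assumes the policy returns one value per cutter and that a value above a threshold means "change this cutter", and the function names recommend and format_recommendation are hypothetical. With Stable Baselines3, the action itself would typically come from agent.predict(observation, deterministic=True).

```python
def recommend(action: list[float], threshold: float = 0.0) -> list[int]:
    """Return indices of cutters whose action value exceeds the threshold."""
    return [i for i, value in enumerate(action) if value > threshold]


def format_recommendation(stroke: int, cutters: list[int]) -> str:
    """Render a short human-readable maintenance note for one stroke."""
    if not cutters:
        return f"stroke {stroke}: no cutter change"
    listed = ", ".join(str(c) for c in cutters)
    return f"stroke {stroke}: change cutter(s) {listed}"
```

For an assumed action vector [-0.5, 0.8, 0.1, -1.0] at stroke 42, this yields "stroke 42: change cutter(s) 1, 2".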

Owner

Name: Reinforcement learning in tunneling
Login: TunnRL
Kind: organization
Email: georg.erharter@ngi.no; tom.frode.hansen@ngi.no; erharter@tugraz.at
Location: Graz (Austria); Oslo (Norway)

Repositories: 1
Profile: https://github.com/TunnRL

Joint organization of Norwegian Geotechnical Institute & TU Graz - Institute for Rock Mechanics and Tunnelling to further reinforcement learning for tunneling.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Erharter"
  given-names: "Georg H."
  orcid: "https://orcid.org/0000-0002-7793-9994"
- family-names: "Hansen"
  given-names: "Tom F."
  orcid: "https://orcid.org/0000-0002-1020-7537"
title: "Code to the paper 'TunnRL-CC: A computational framework for smart TBM cutter changing'"
version: 2.0.0
date-released: 2022-XX-XX
url: "https://github.com/TunnRL/TunnRL_TBM_maintenance"
preferred-citation:
  type: article
  authors:
  - family-names: "Erharter"
    given-names: "Georg H."
    orcid: "https://orcid.org/0000-0002-7793-9994"
  - family-names: "Hansen"
    given-names: "Tom F."
    orcid: "https://orcid.org/0000-0002-1020-7537"
  doi: "XXXXXXXXXXXXX"
  url: "XXXXXXXXXXXXX"
  journal: "Computers and Geotechnics"
  issn: "XXXXXXXXXXXXX"
  month: "XXXXXXXXXXXXX"
  title: "TunnRL-CC: A computational framework for smart TBM cutter changing"
  volume: "XXXXXXXXXXXXX"
  year: "XXXXXXXXXXXXX"
  keywords: "XXXXXXXXXXXXX"
  abstract: "XXXXXXXXXXXXX"

Dependencies

poetry.lock pypi

129 dependencies

pyproject.toml pypi

black ^22.6.0 develop
flake8 ^5.0.3 develop
flake8-annotations ^2.9.1 develop
ipdb ^0.13.9 develop
isort ^5.10.1 develop
mypy ^0.971 develop
pyupgrade ^2.37.3 develop
rich ^12.5.1 develop
setuptools <60 develop
importlib-resources ^5.9.0
joblib ^1.1.0
mlflow ^1.28.0
numpy ^1.23.1
optuna ^2.10.1
pysqlite3 ^0.4.7
python >=3.10.5,<3.12
scikit-learn ^1.1.1
stable-baselines3 ^1.6.0
tensorboard ^2.10.0
umap-learn ^0.5.3

environment.yaml conda

joblib 1.1.0.*
matplotlib 3.5.1.*
mlflow 1.28.0.*
numpy 1.23.1.*
pandas
pip
python 3.10.*
scikit-learn 1.1.1.*
spyder-kernels 2.3.*
