Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: jpv219
- Language: Python
- Default Branch: main
- Size: 61.8 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
LSTMIX: Multivariate multistep RNN predictive framework for multidimensional mixing performance metrics.
Overview
This Python repository provides a comprehensive framework for preprocessing and augmenting multivariate, multidimensional mixing performance metrics and statistics in order to train Recurrent Neural Networks with different cell types (LSTM and GRU) and network architectures (Fully-Connected and Encoder-Decoder) using PyTorch. The framework includes scripts for data generation (reading and preprocessing), model training, hyperparameter tuning, prediction generation via sequence rollout, and model uncertainty quantification via ensemble-based perturbations.
Repository Structure
The repository is organized into several key components:
1. datagen.py
This script is responsible for generating and preprocessing datasets for training, validation, and testing. It relies on two additional components:
- input.py: handles the import, organization, scaling, and smoothing of raw data from DNS simulations stored in CSV files.
- Windowing class from modeltrain_LSTM.py: augments data by windowing and packages it into corresponding pickle (pkl) files with labels and relevant data for later use.
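The repository's exact transforms are not reproduced here, but the scaling and smoothing that input.py performs on raw time series can be sketched as follows. This is a minimal illustration: the function names and the min-max/moving-average choices are assumptions, not the repository's actual API.

```python
import numpy as np

def minmax_scale(series):
    """Scale a 1-D series to [0, 1], a common step before RNN training."""
    lo, hi = float(series.min()), float(series.max())
    if hi == lo:  # constant series: avoid division by zero
        return np.zeros_like(series, dtype=float)
    return (series - lo) / (hi - lo)

def moving_average(series, window=3):
    """Smooth a 1-D series with a centered moving average of width `window`."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="same")
```

In practice the scaler's parameters (lo, hi) would be fitted on the training split only and reused on validation/test data to avoid leakage.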
2. modeltrain_LSTM.py
This script contains essential classes and functionalities for model training. It includes:
- Windowing class: Implements data windowing for augmentation.
- RNN abstract classes and cell-specific child classes: Provides implementations for both fully connected and encoder-decoder architectures using either GRU or LSTM cells.
- Model training logic: Conducts the main training process, saving model states, trained datasets, and the hyperparameters used during training.
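The Windowing class itself is not shown here, but the sliding-window augmentation it implements can be sketched as a stand-alone function. `make_windows` and its signature are hypothetical, not the repository's class interface; the idea is to cut one long (time, features) series into many (input window, target window) pairs for multistep training.

```python
import numpy as np

def make_windows(series, in_steps, out_steps, stride=1):
    """Slice a (time, features) array into (input, target) window pairs
    for multistep sequence-to-sequence training.

    Each input window covers `in_steps` time steps; its target is the
    `out_steps` steps that immediately follow it.
    """
    X, Y = [], []
    for start in range(0, len(series) - in_steps - out_steps + 1, stride):
        X.append(series[start : start + in_steps])
        Y.append(series[start + in_steps : start + in_steps + out_steps])
    return np.stack(X), np.stack(Y)
```

A 10-step, 2-feature series with `in_steps=4, out_steps=2` yields 5 overlapping training pairs, which is the "augmentation" effect: one trajectory becomes many samples.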
3. hyperparam_tuning.py
This script leverages Ray Tune to perform hyperparameter tuning on the RNN architectures defined in modeltrain_LSTM.py. It aims to optimize the model's performance by systematically exploring hyperparameter combinations.
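Ray Tune's own API is beyond a short snippet, but the idea it automates can be sketched as a plain random search over a hyperparameter space. The search space and `objective` below are illustrative stand-ins, not the repository's actual tuning configuration.

```python
import random

# Illustrative search space; the repository's real one may differ.
SEARCH_SPACE = {
    "hidden_size": [32, 64, 128],
    "num_layers": [1, 2],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def sample_config(space, rng):
    """Draw one hyperparameter combination at random."""
    return {name: rng.choice(values) for name, values in space.items()}

def random_search(objective, space, trials=10, seed=0):
    """Evaluate `trials` random configs; return the best (score, config).

    `objective` maps a config to a score to minimise, e.g. the
    validation loss of an RNN trained with that config.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        cfg = sample_config(space, rng)
        score = objective(cfg)
        if best is None or score < best[0]:
            best = (score, cfg)
    return best
```

Ray Tune layers scheduling (early stopping of poor trials) and parallel execution on top of this basic sample-evaluate-keep-best loop.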
4. rollout_prediction.py
This script facilitates the evaluation of the trained model. It includes functionalities to:
- Plot trained and validated datasets.
- Execute a rollout operation to predict values from the test set and compare them against the ground truth.
- Plot various metrics such as a y=x plot, Wasserstein and K-L divergence plots, and more.
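The rollout operation mentioned above is autoregressive: each prediction is appended to the input window and fed back in, so later predictions depend on earlier ones. A minimal sketch, with a toy stand-in for the trained model (the real script uses the trained PyTorch RNN, not this lambda):

```python
import numpy as np

def rollout(model_step, seed_window, n_steps):
    """Autoregressive rollout: predict one step, slide the window to
    include it, and repeat for `n_steps` predictions."""
    window = list(seed_window)
    preds = []
    for _ in range(n_steps):
        nxt = model_step(np.asarray(window))
        preds.append(nxt)
        window = window[1:] + [nxt]  # drop oldest step, append prediction
    return np.asarray(preds)

# Stand-in "model": predicts the mean of the current window.
demo = rollout(lambda w: float(w.mean()), seed_window=[1.0, 2.0, 3.0], n_steps=4)
```

Because errors compound through the feedback loop, comparing the rollout against ground truth with distributional metrics such as the Wasserstein distance and K-L divergence, as the script does, is more informative than pointwise error alone.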
Directory Structure
```plaintext
LSTMIX/
│
├── Clean_CSV.py
│
├── config/
│   ├── config_paths.ini
│   ├── config_sm.ini
│   └── config_sv.ini
│
├── data_gen.py
│
├── figs/
│   ├── input_data/
│   ├── performance_logs/
│   ├── perturbations/
│   ├── rollouts/
│   ├── split_data/
│   ├── temporal_dist/
│   ├── temporal_EMD/
│   └── windowed/
│
├── hyperparam_tuning.py
│
├── input_data/
│   └── inputdata.pkl
│
├── input.py
├── Load_Clean_DF.py
├── modeltrain_LSTM.py
├── perturbation.py
├── rollout_prediction.py
├── README.md
├── requirements.txt
├── tools_modeltraining.py
│
├── trained_models/
│   ├── data_sets_GRU_ED/
│   ├── data_sets_GRU_FC/
│   ├── data_sets_LSTM_ED/
│   ├── data_sets_LSTM_FC/
│   ├── GRU_ED_logs/
│   ├── GRU_ED_trained_model.pt
│   ├── GRU_FC_logs/
│   ├── GRU_FC_trained_model.pt
│   ├── hyperparams_GRU_ED.txt
│   ├── hyperparams_GRU_FC.txt
│   ├── hyperparams_LSTM_ED.txt
│   ├── hyperparams_LSTM_FC.txt
│   ├── LSTM_ED_logs/
│   ├── LSTM_ED_trained_model.pt
│   ├── LSTM_FC_logs/
│   └── LSTM_FC_trained_model.pt
│
├── tuning/
│   ├── best_models/
│   ├── GRU_ED/
│   ├── GRU_FC/
│   ├── LSTM_ED/
│   └── LSTM_FC/
│
└── RawData/   # Not part of the repository, user data
```
Getting Started
To use this framework, follow these steps:
- Clone the repository:
  `git clone https://github.com/jpv219/LSTMIX.git`
- Install the required dependencies:
  `pip install -r requirements.txt`
- Execute the data generation script:
  `python datagen.py`
- Train the RNN model:
  `python modeltrain_LSTM.py`
- Tune hyperparameters (optional):
  `python hyperparam_tuning.py`
- Evaluate model performance:
  `python rollout_prediction.py`
Dependencies
- PyTorch
- Ray Tune
- NumPy
- Matplotlib
- Other dependencies specified in requirements.txt
Acknowledgments
- The developers and contributors of PyTorch, Ray Tune, and other open-source libraries used in this framework.
Owner
- Name: Juan Pablo Valdes
- Login: jpv219
- Kind: user
- Location: London, U.K.
- Company: Imperial College London
- Website: https://jpv219.github.io
- Repositories: 1
- Profile: https://github.com/jpv219
PhD Researcher at Imperial College London, M.Sc., Ch. E.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
LSTMIX: Multivariate multistep RNN predictive framework
for multidimensional mixing performance metrics.
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Juan Pablo
family-names: Valdes
email: j.valdes20@imperial.ac.uk
affiliation: Imperial College London
orcid: 'https://orcid.org/0000-0003-3249-5194'
- given-names: Fuyue
family-names: Liang
email: fuyue.liang18@imperial.ac.uk
affiliation: Imperial College London
orcid: 'https://orcid.org/0000-0002-4159-6993'
repository-code: 'https://github.com/jpv219/LSTMIX'
abstract: >-
This Python repository provides a comprehensive framework
for preprocessing and augmenting multivariate and
multidimensional mixing performance metrics and statistics
to train Recurrent Neural Networks with different cell
types (LSTM and GRU) and network architectures
(Fully-Connected and Encoder-Decoder) using PyTorch. The
framework includes scripts for data generation (read and
preprocess), model training, hyperparameter tuning, and
prediction generation via sequence rollout, and model
uncertainty quantification via ensemble-based
perturbations.
keywords:
- Recurrent Neural Networks
- LSTM
- GRU
- Mixing
commit: ' 9d1dae5'
version: '2.0'
date-released: '2024-04-13'
Dependencies
- matplotlib ==3.4.3
- numpy ==1.21.2
- ray ==1.8.0
- torch ==1.9.1