disruption-prediciton-based-on-multimodal-deep-learning
Research-repository: Disruption Prediction and Analysis through Multimodal Deep Learning in KSTAR
https://github.com/zinzinbin/disruption-prediciton-based-on-multimodal-deep-learning
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 5 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org, sciencedirect.com, springer.com
- ○ Academic email domains: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (11.4%) to scientific vocabulary
Keywords
Repository
Basic Info
- Host: GitHub
- Owner: ZINZINBIN
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://www.sciencedirect.com/science/article/abs/pii/S0920379624000577
- Size: 196 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
# Disruptive prediction model using KSTAR video and numerical data via Deep Learning [Paper : Disruption Prediction and Analysis Through Multimodal Deep Learning in KSTAR]
Introduction
How to generate training data
- First, we set image sequence data as the input (B, T, C, W, H) and assume that the last frame, in which the plasma image in the tokamak disappears, corresponds to a disruption.
- Then, the second-to-last frame of the image sequence can be considered the current quench.
- Thus, the frame sequences of each experiment that include this second-to-last frame are labeled as disruptive.
- Under this condition, neural networks trained on this labeled dataset can predict a disruption prior to the current quench (a minimal sketch of the labeling rule follows this list).
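A minimal sketch of this labeling rule, assuming a fixed frame rate and a sliding window; the function name `label_windows` and the index arithmetic are illustrative, not the repository's implementation:

```python
def label_windows(num_frames: int, seq_len: int, dist: int):
    """Illustrative labeling rule: the last frame is the disruption and the
    second-to-last frame is treated as the current quench. A window of
    seq_len frames is labeled disruptive (1) when the frame it is asked to
    predict, dist frames ahead of the window, reaches the quench frame."""
    quench_idx = num_frames - 2                       # second-to-last frame
    labels = []
    for start in range(num_frames - seq_len - dist + 1):
        predicted_frame = start + seq_len - 1 + dist  # frame the model predicts about
        labels.append(int(predicted_frame >= quench_idx))
    return labels
```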
The model performance of disruption prediction
Analysis of the models using visualization of hidden vectors
Analysis of the models using permutation feature importance
Analysis of the models using GradCAM and attention rollout
Enhancement of predicting disruptions using multimodal learning
Environment
The code was developed with Python 3.9 on Ubuntu 18.04.
GPUs used : NVIDIA GeForce RTX 3090 24 GB x 4
The resources for training the networks were provided by PLARE at Seoul National University.
How to Run
setting
Environment
```
conda env create -f environment.yaml
conda activate research-env
```

Video Dataset Generation : old version, inefficient memory usage and scalability
```
# generate disruptive video data and normal video data from .avi
python3 ./src/generate_video_data_fixed.py --fps 210 --duration 21 --distance 5 --save_path './dataset/'

# train and test split with converting video as image sequences
python3 ./src/preprocessing.py --test_ratio 0.2 --valid_ratio 0.2 --video_data_path './dataset/dur21_dis0' --save_path './dataset/dur21_dis0'
```

Video Dataset Generation : new version, more efficient than the old version
```
# additional KSTAR shot log with frame information of the video data
python3 ./src/generate_modified_shot_log.py

# generate the video dataset from the extended KSTAR shot log : no need to split the train-test set for every distance
python3 ./src/generate_video_data.py --fps 210 --raw_video_path "./dataset/raw_videos/raw_videos/" --df_shot_list_path "./dataset/KSTAR_Disruption_Shot_List_extend.csv" --save_path "./dataset/temp" --width 256 --height 256 --overwrite True
```

0D Dataset Generation (numerical dataset)
```
# interpolate KSTAR data and convert to a tabular dataframe
python3 ./src/generate_numerical_data.py
```
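As a rough illustration of the interpolation step above (not the repository's implementation; the signal names `ip`, `ne` and the time base are assumptions), resampling two diagnostics onto a common time axis and storing them as one dataframe might look like this:

```python
import numpy as np
import pandas as pd

# hypothetical raw signals sampled on different time bases
t_ip = np.linspace(0.0, 10.0, 500)
ip = np.random.rand(500)          # plasma current (placeholder values)
t_ne = np.linspace(0.0, 10.0, 800)
ne = np.random.rand(800)          # electron density (placeholder values)

# interpolate both signals onto a common time base and store them as one table
t = np.arange(0.0, 10.0, 0.01)
df = pd.DataFrame({
    "time": t,
    "ip": np.interp(t, t_ip, ip),
    "ne": np.interp(t, t_ne, ne),
})
print(df.head())
```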
Test
Test code before model training : check for invalid data or issues in the model architecture
```
# test all processes : data + model
pytest test

# test the data validity
pytest test/test_data.py

# test the model validity
pytest test/test_model.py
```
Model training process
Models for video data

python3 train_vision_nework.py --batch_size {batch size} --gpu_num {gpu num} --use_LDAM {bool : use LDAM loss} --model_type {model name} --tag {name of experiment / info} --use_DRW {bool : use Deferred re-weighting} --use_RS {bool : use re-sampling} --seq_len {int : input sequence length} --pred_len {int : prediction time} --image_size {int}

Models for 0D data

python3 train_0D_nework.py --batch_size {batch size} --gpu_num {gpu num} --use_LDAM {bool : use LDAM loss} --model_type {model name} --tag {name of experiment / info} --use_DRW {bool : use Deferred re-weighting} --use_RS {bool : use re-sampling} --seq_len {int : input sequence length} --pred_len {int : prediction time}

Models for MultiModal (video + 0D data)

python3 train_multi_modal.py --batch_size {batch size} --gpu_num {gpu num} --use_LDAM {bool : use LDAM loss} --use_GB {bool : use Gradient Blending} --tag {name of experiment / info} --use_DRW {bool : use Deferred re-weighting} --use_RS {bool : use re-sampling} --seq_len {int : input sequence length} --pred_len {int : prediction time} --tau {int : stride for input sequence}
Experiment
Experiment for each network (vision, 0D, multimodal) with different prediction times
```
# R2Plus1D
sh exp/exp_r1plus1d.sh

# SlowFast
sh exp/exp_slowfast.sh

# ViViT
sh exp/exp_vivit.sh

# Transformer
sh exp/exp_0D_transformer.sh

# CnnLSTM
sh exp/exp_0D_cnnlstm.sh

# MLSTM-FCN
sh exp/exp_0D_mlstm.sh

# Multimodal model
sh exp/exp_multi.sh

# Multimodal model with Gradient Blending
sh exp/exp_multi_gb.sh
```

Experiment with different learning algorithms and models
```
# case : R2Plus1D
sh exp/exp_la_r2plus1d.sh

# case : SlowFast
sh exp/exp_la_slowfast.sh

# case : ViViT
sh exp/exp_la_vivit.sh
```
Model performance visualization for continuous disruption prediction using gif
python3 make_continuous_prediction.py
Detail
Model to use
Video encoder
- R2Plus1D : https://github.com/irhum/R2Plus1D-PyTorch
- Slowfast : https://github.com/facebookresearch/SlowFast
- ViViT : https://github.com/rishikksh20/ViViT-pytorch
0D data encoder
- Transformer : paper(https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf), application code(https://www.kaggle.com/general/200913)
- Conv1D-LSTM using self-attention : https://pseudo-lab.github.io/Tutorial-Book/chapters/time-series/Ch5-CNN-LSTM.html
- MLSTM_FCN : paper(https://arxiv.org/abs/1801.04503), application code(https://github.com/titu1994/MLSTM-FCN)
Multimodal Model
- Multimodal fusion model: video encoder + 0D data encoder
- Tensor Fusion Network (a rough sketch of the outer-product fusion follows this list)
- Other methods (Future work)
- Multimodal deep representation learning for video classification : https://link.springer.com/content/pdf/10.1007/s11280-018-0548-3.pdf?pdf=button
- Truly Multi-modal YouTube-8M Video Classification with Video, Audio, and Text : https://static.googleusercontent.com/media/research.google.com/ko//youtube8m/workshop2017/c06.pdf
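A rough sketch of the tensor-fusion idea for two modalities, assuming PyTorch; the function name and dimensions are illustrative rather than the repository's implementation:

```python
import torch

def tensor_fusion(h_video: torch.Tensor, h_0d: torch.Tensor) -> torch.Tensor:
    """Outer-product fusion of a video embedding (B, Dv) and a 0D embedding (B, D0)."""
    ones = torch.ones(h_video.size(0), 1, device=h_video.device)
    zv = torch.cat([h_video, ones], dim=1)               # (B, Dv + 1)
    z0 = torch.cat([h_0d, ones], dim=1)                  # (B, D0 + 1)
    fused = torch.bmm(zv.unsqueeze(2), z0.unsqueeze(1))  # (B, Dv + 1, D0 + 1)
    return fused.flatten(start_dim=1)                    # flattened input for a classifier head
```

Appending the constant 1 keeps the unimodal terms inside the fused outer product, so the classifier head can still use each modality on its own.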
Technique or algorithm to use
Solving the imbalanced classification issue
- Re-Sampling : ImbalancedWeightedSampler, over-sampling for minor classes
- Re-Weighting : use inverse class frequencies as weights in the loss function (CE, Focal Loss, LDAM Loss); a minimal sketch follows this list
- LDAM with DRW : label-distribution-aware margin loss with deferred re-weighting scheduling
- Multimodal Learning : Gradient Blending for avoiding sub-optimal solutions caused by dominant modalities
- Multimodal Learning : CCA learning for enhancement
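A minimal sketch of the inverse-class-frequency re-weighting step, assuming PyTorch; the class counts are made up for illustration, and LDAM, DRW scheduling, and Gradient Blending are not shown:

```python
import torch
import torch.nn as nn

# hypothetical class counts: many normal windows, few disruptive ones
class_counts = torch.tensor([9500.0, 500.0])                        # [normal, disruptive]
weights = class_counts.sum() / (len(class_counts) * class_counts)   # inverse class frequency
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 2)               # dummy model outputs
targets = torch.randint(0, 2, (8,))      # dummy labels
loss = criterion(logits, targets)        # minority-class errors are weighted more heavily
```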
Analysis of the physical characteristics of disruptive video data
- CAM : in progress
- Grad-CAM : paper(https://arxiv.org/abs/1610.02391), target models(R2Plus1D, SlowFast)
- Attention rollout : paper(https://arxiv.org/abs/2005.00928), target model(ViViT); a rough sketch follows this list
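A rough sketch of attention rollout as described in the paper above (average attention over heads, mix in the identity for residual connections, and multiply layer by layer); the tensor layout is an assumption, not the repository's code:

```python
import torch

def attention_rollout(attentions):
    """attentions: list of per-layer attention maps, each of shape (heads, tokens, tokens)."""
    tokens = attentions[0].size(-1)
    rollout = torch.eye(tokens)
    for attn in attentions:
        a = attn.mean(dim=0)                      # average over heads
        a = 0.5 * a + 0.5 * torch.eye(tokens)     # account for residual connections
        a = a / a.sum(dim=-1, keepdim=True)       # re-normalize rows
        rollout = a @ rollout                     # propagate attention through layers
    return rollout                                # (tokens, tokens) token-to-token attribution
```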
Data augmentation
- Video mixup algorithm for data augmentation (done, not effective); a minimal sketch follows this list
- Conventional image augmentation (flip, brightness, contrast, blur, shift)
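A minimal sketch of mixup applied to a batch of video clips, assuming PyTorch; the Beta parameter and tensor shapes are illustrative:

```python
import torch

def video_mixup(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """x: video batch (B, T, C, H, W); y: one-hot or soft labels (B, num_classes)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]      # blend clips pairwise
    y_mix = lam * y + (1.0 - lam) * y[perm]      # blend labels with the same coefficient
    return x_mix, y_mix
```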
Training Process enhancement
- Multigrid training algorithm : fast training for SlowFast
- Deep CCA : deep canonical correlation analysis to train the multimodal representation
Generalization and Robustness
- Add noise to the image sequence and 0D data for robustness (a minimal sketch follows this list)
- Multimodality can also improve robustness to noise in the data
- Gradient Blending for avoiding sub-optimal states in multimodal learning
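A minimal sketch of the noise injection, assuming PyTorch tensors; the noise scales are arbitrary illustrative values:

```python
import torch

def add_noise(video: torch.Tensor, zero_d: torch.Tensor,
              sigma_video: float = 0.05, sigma_0d: float = 0.01):
    """Add zero-mean Gaussian noise to a video batch (B, T, C, H, W)
    and a 0D signal batch (B, T, F) during training."""
    video_noisy = video + sigma_video * torch.randn_like(video)
    zero_d_noisy = zero_d + sigma_0d * torch.randn_like(zero_d)
    return video_noisy, zero_d_noisy
```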
Additional Task
- Multi-GPU distributed learning : done
- Database construction : tabular dataset (IKSTAR) + video dataset, done
- ML pipeline : TensorBoard, done
Dataset
- Disruption : disruptive state at t = tipminf (current quench)
- Borderline : intermediate region between normal and disruptive states (not used)
- Normal : non-disruptive state
📖 Citation
If you use this repository in your research, please cite the following:
📜 Research Article
Disruption prediction and analysis through multimodal deep learning in KSTAR
Kim, Jinsu, et al. "Disruption prediction and analysis through multimodal deep learning in KSTAR." Fusion Engineering and Design 200 (2024): 114204.
📌 Code Repository
Jinsu Kim (2024). Disruption-Prediciton-based-on-Multimodal-Deep-Learning. GitHub.
https://github.com/ZINZINBIN/Disruption-Prediciton-based-on-Multimodal-Deep-Learning
📚 BibTeX:
@software{Kim_Deep_Multimodal_Learning_2024,
author = {Kim, Jinsu},
doi = {10.1016/j.fusengdes.2024.114204},
license = {MIT},
month = feb,
title = {{Deep Multimodal Learning based KSTAR Disruption Prediction Model}},
url = {https://github.com/ZINZINBIN/Disruption-Prediciton-based-on-Multimodal-Deep-Learning},
version = {1.0.0},
year = {2024}
}
Owner
- Name: KIM JINSU
- Login: ZINZINBIN
- Kind: user
- Location: Seoul, Republic of Korea
- Company: Seoul National University
- Repositories: 6
- Profile: https://github.com/ZINZINBIN
- BS : Nuclear Engineering / Physics
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite our article and repository."
authors:
- family-names: "Kim"
given-names: "Jinsu"
orcid: "https://orcid.org/0009-0000-2610-4551"
title: "Deep Multimodal Learning based KSTAR Disruption Prediction Model"
version: "1.0.0"
doi: "https://doi.org/10.1016/j.fusengdes.2024.114204" # Replace with your DOI
url: "https://github.com/ZINZINBIN/Disruption-Prediciton-based-on-Multimodal-Deep-Learning"
date-released: "2024-02-02"
license: "MIT"
GitHub Events
Total
- Watch event: 2
- Push event: 2
Last Year
- Watch event: 2
- Push event: 2