glitchflow-itwinai-plugin

Glitchflow plugin for itwinai

https://github.com/intertwin-eu/glitchflow-itwinai-plugin

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

Glitchflow plugin for itwinai

Basic Info
  • Host: GitHub
  • Owner: interTwin-eu
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 293 KB
Statistics
  • Stars: 1
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 11 months ago · Last pushed 8 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation Codeowners Authors

README.md

Index

DT Virgo Use Case

The sensitivity of Gravitational Wave (GW) interferometers is limited by noise. We have been using Generative Neural Networks (GenNNs) to produce a Digital Twin (DT) of the Virgo interferometer that realistically simulates transient noise in the detector. We have used the GenNN-based DT to generate synthetic strain data (the channel that measures the deformation induced by the passage of a gravitational wave).

Furthermore, the detector is equipped with sensors that monitor the status of the detector's subsystems as well as the environmental conditions (wind, temperature, seismic motion), whose output is saved in the so-called auxiliary channels. In a second phase, also in view of the Einstein Telescope, we will therefore use the trained model to characterise the noise and to optimise the use of auxiliary channels for vetoing and denoising the signal in low-latency searches, i.e. the data analysis pipelines that search for transient astrophysical signals in almost real time. This will allow the low-latency searches (which are not part of the DT) to send more reliable triggers to observatories for multi-messenger astronomy.

Figure 1 shows the high-level architecture of the DT. Data streams from the auxiliary channels are used to find the transfer function of the system that produces non-linear noise in the detector output. The output function compares the simulated and the real signals in order to issue a veto decision (whether to further process incoming data in low-latency searches) or to remove the noise contribution from the real signal (denoising).

High-level architecture of the DT
Figure 1: High-level architecture of the DT.

Figure 2 shows the System Context diagram of the DT for the veto and denoising pipeline. Two main subsystems characterise the DT architecture: the Training DT subsystem and the Inference DT subsystem. The Training DT subsystem is responsible for the periodic re-training of the DT model on a buffered subsample of the most recent Virgo data; the DT model needs to be updated to reflect the current status of the interferometer, so the GenNN must be retrained continuously. The Inference DT subsystem is responsible for the low-latency vetoing and denoising of the detector's data stream. All modules within both subsystems are implemented as itwinai plugins. Itwinai offers several key features that benefit the DT, including distributed training capabilities, a robust logging and model catalog system, enhanced code reusability, and a user-friendly configuration interface for pipelines.

System context diagram of the DT.
Figure 2: System context diagram of the DT.

The Training DT Subsystem

Operators initiate the Training Subsystem. The ANNALISA module first selects relevant channels for network training by analyzing time-frequency data (Q-Transform) to find correlations measured as coincident spikes in signal energy above a threshold.
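The coincidence criterion described above can be illustrated with a small NumPy sketch. The data are made up and `coincident_spikes` is a hypothetical helper, not part of ANNALISA; it simply counts how often an auxiliary channel shows an above-threshold energy spike at (nearly) the same time as the main channel:

```python
import numpy as np

def coincident_spikes(main_energy, aux_energy, threshold, window=1):
    """Fraction of above-threshold spikes in the main channel that are
    matched by a spike in the auxiliary channel within +/- `window`
    time bins. Illustrative only, not ANNALISA's actual algorithm."""
    main_hits = np.flatnonzero(main_energy > threshold)
    aux_hits = np.flatnonzero(aux_energy > threshold)
    if main_hits.size == 0:
        return 0.0
    matched = sum(np.any(np.abs(aux_hits - t) <= window) for t in main_hits)
    return matched / main_hits.size

# Toy data: an auxiliary channel that spikes with the main one scores high.
rng = np.random.default_rng(0)
main = rng.random(1000)
main[[100, 400, 800]] += 5.0          # three "glitches" in the main channel
correlated = rng.random(1000)
correlated[[100, 401, 800]] += 5.0    # spikes at (almost) the same times
unrelated = rng.random(1000)          # no spikes at all

print(coincident_spikes(main, correlated, threshold=3.0))  # 1.0
print(coincident_spikes(main, unrelated, threshold=3.0))   # 0.0
```

A channel scoring above some chosen fraction would then be selected as relevant for network training.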

After this initial step, operators preprocess data retrieved from the Virgo Data Lake. ANNALISA handles this preprocessing, which includes data resampling, whitening, spectrogram generation, image cropping, and loading into a custom PyTorch dataloader. This dataloader then feeds a Generative Neural Network (GenNN) during training.
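The preprocessing chain (resampling, whitening, time-frequency map) can be sketched with plain SciPy equivalents. The real pipeline relies on gwpy/ml4gw utilities and a Q-transform, so the function names and parameters below are illustrative only:

```python
import numpy as np
from scipy import signal

def preprocess(strain, fs_in, fs_out=2048, nperseg=512):
    """Illustrative resample -> whiten -> spectrogram chain (not the
    plugin's actual code, which uses gwpy/ml4gw and a Q-transform)."""
    # 1. Resample the strain to the working rate.
    n_out = int(len(strain) * fs_out / fs_in)
    x = signal.resample(strain, n_out)
    # 2. Whiten: divide the spectrum by the estimated amplitude spectral
    #    density so all frequencies contribute equally.
    freqs, psd = signal.welch(x, fs=fs_out, nperseg=nperseg)
    asd = np.sqrt(psd)
    X = np.fft.rfft(x)
    f_fft = np.fft.rfftfreq(len(x), d=1 / fs_out)
    white = np.fft.irfft(X / np.interp(f_fft, freqs, asd), n=len(x))
    # 3. Time-frequency map (the DT uses a Q-transform instead).
    f, t, spec = signal.spectrogram(white, fs=fs_out, nperseg=nperseg)
    return spec

rng = np.random.default_rng(1)
spec = preprocess(rng.standard_normal(4096 * 4), fs_in=4096)
print(spec.shape)  # (frequency bins, time bins)
```

The resulting 2D arrays would then be cropped and wrapped in a PyTorch dataloader, as described above.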

The chosen neural network is a convolutional U-Net, featuring residual blocks and attention gates with enhanced skip connections to better capture data complexity. Other architectures are available for the user to experiment with. The GlitchFlow module manages both the model definition and the training. As the model trains, its weights and performance metrics are systematically logged into a dedicated model registry on MLFlow, making them accessible to the Inference Subsystem. Itwinai facilitates this logging and the offloading to computing infrastructure during training.
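For orientation, a heavily scaled-down PyTorch sketch of a U-Net with residual blocks and attention-gated skip connections might look as follows. The real GlitchFlow model is deeper and its architecture is defined in Model.py; all class names, widths, and depths here are assumptions:

```python
import torch
from torch import nn

class ResBlock(nn.Module):
    """Two 3x3 convolutions with a residual shortcut."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(),
            nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out),
        )
        self.skip = nn.Conv2d(c_in, c_out, 1) if c_in != c_out else nn.Identity()

    def forward(self, x):
        return torch.relu(self.conv(x) + self.skip(x))

class AttentionGate(nn.Module):
    """Additive attention: the decoder signal g weights the encoder skip x."""
    def __init__(self, c_g, c_x, c_mid):
        super().__init__()
        self.wg = nn.Conv2d(c_g, c_mid, 1)
        self.wx = nn.Conv2d(c_x, c_mid, 1)
        self.psi = nn.Sequential(nn.Conv2d(c_mid, 1, 1), nn.Sigmoid())

    def forward(self, g, x):
        return x * self.psi(torch.relu(self.wg(g) + self.wx(x)))

class TinyAttentionUNet(nn.Module):
    """Two-level encoder/decoder; the real model is deeper."""
    def __init__(self, c_in=1, c_out=1, width=16):
        super().__init__()
        self.enc1 = ResBlock(c_in, width)
        self.enc2 = ResBlock(width, width * 2)
        self.down = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(width * 2, width, 2, stride=2)
        self.att = AttentionGate(width, width, width // 2)
        self.dec = ResBlock(width * 2, width)
        self.head = nn.Conv2d(width, c_out, 1)

    def forward(self, x):
        s1 = self.enc1(x)              # encoder, full resolution
        s2 = self.enc2(self.down(s1))  # bottleneck at half resolution
        g = self.up(s2)                # decoder upsamples back
        s1 = self.att(g, s1)           # gate the skip connection
        return self.head(self.dec(torch.cat([g, s1], dim=1)))

x = torch.randn(2, 1, 64, 64)          # a batch of toy spectrograms
y = TinyAttentionUNet()(x)
print(y.shape)  # torch.Size([2, 1, 64, 64])
```

The output keeps the input's spatial shape, as needed for spectrogram-to-spectrogram denoising.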

The Inference DT Subsystem

Users, typically GW detector characterization or data analysis experts, activate the Inference Subsystem. They start by selecting the data for analysis, which then undergoes the same preprocessing steps as those applied during the training phase. Subsequently, a trained model is loaded from the model catalog and utilized to perform inference on the chosen data.

The output of this process comprises "clean" data, ideally free of glitches, and metadata containing veto flagging information, which identifies glitch instances. Both the cleaned data and metadata are logged, offering a complete record of the denoising and vetoing operations.

Logged details, including images of the real, generated, and cleaned data, are accessible on TensorBoard. Metadata containing veto flag information, organized by the GPS time of the analyzed data, is also logged. Furthermore, metadata for any data that failed to be cleaned is recorded, including the area and Signal-to-Noise Ratio (SNR) of glitches still visible after cleaning. To access this information, users can launch TensorBoard and navigate through the logged events, which are categorized by run and timestamp, allowing for detailed visualization and analysis of the inference results. The entire pipeline, encompassing data selection, inference, and logging, is configurable via a YAML file, enabling users to specify modules to execute, preprocessing parameters, dataset specifics, network architecture, and paths for saving results.
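A minimal sketch of the veto metadata described above (residual glitch area and peak $SNR^2$ per region), using SciPy's connected-component labelling. The function name, dictionary keys, and exact statistic are illustrative, not the plugin's API:

```python
import numpy as np
from scipy import ndimage

def residual_glitch_report(real_spec, clean_spec, snr2_threshold=25.0):
    """Flag data whose cleaned spectrogram still contains glitch power.
    Returns a veto flag plus the area (pixels) and peak SNR^2 of each
    residual region. Illustrative only, not the plugin's actual code."""
    residual = real_spec - clean_spec
    labels, n = ndimage.label(residual > snr2_threshold)  # residual regions
    regions = []
    for i in range(1, n + 1):
        pix = residual[labels == i]
        regions.append({"area_px": int(pix.size), "max_snr2": float(pix.max())})
    return {"clean": n == 0, "residuals": regions}

real = np.zeros((64, 64))
real[10:13, 20:24] = 100.0       # a loud 3x4-pixel glitch in the raw data
partially_cleaned = real * 0.5   # half the glitch power remains

print(residual_glitch_report(real, real))               # clean: True
print(residual_glitch_report(real, partially_cleaned))  # one 12-pixel residual
```

The returned dictionary mirrors the kind of per-GPS-time record that ends up in the TensorBoard logs.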

Technical documentation

The following shows how to set up and run the Virgo DT pipeline.

Requirements

  • itwinai==0.3.0
  • torch==2.4.1
  • torchaudio==2.4.1
  • torchvision==0.19.1
  • torchmetrics==1.6.2
  • gwpy==3.0.12
  • h5py==3.13.0
  • numpy==1.26.4
  • matplotlib==3.10.1
  • pandas==2.2.3
  • scipy==1.14.1
  • tensorboard==2.19.0
  • mlflow==2.20.3
  • ml4gw==0.7.4
  • scikit-image==0.25.2
  • scikit-learn==1.6.1
  • tensorflow==2.19.0

For itwinai, follow the official installation page:
https://itwinai.readthedocs.io/latest/installation/user_installation.html

Some of the other packages may require their official installation documentation:
https://github.com/ML4GW/ml4gw
https://www.mlflow.org/docs/latest/ml/tracking/quickstart
https://www.tensorflow.org/install?hl=it

Installation and configuration

Execute setup.sh to create the directory tree in your working directory.

chmod +x setup.sh

./setup.sh

Then your current directory will look like

.
├── annalisarun  # Saved ANNALISA data
├── datasets     # Processed dataset
├── QTdatasets   # Spectrograms
└── temp         # Data saved during training

The saveconf.yaml file contained inside the conf directory can be used to adjust the directory tree to the user's setup.

The other files contained in the conf directory define the processing pipeline parameters (see each file for a detailed list).

Modules

  • Data.py: class containing data structures and methods for data preprocessing used in the pipeline such as:

    • The TFrame class, used for reading PyTorch tensors and their associated metadata during the pipeline workflow
    • Methods for reading and processing GW data
    • Methods and classes for working with different data formats such as YAML and JSON
    • Methods used for preprocessing the dataset before model training
    • Various custom matplotlib plotting functions
  • Dataloader.py: Itwinai's classes for data loading steps. It provides:

    • Processing of GW data
    • Dataset splitting and preprocessing before training
    • Loading data for inference
    • Spectrogram dataset visualization utility.
  • Scanner.py: Itwinai class that selects relevant channels for network training by analyzing time-frequency data (Q-Transform) to find correlations, measured as coincident spikes in signal energy above a threshold. Parameters can be defined via the scan.yaml file. Results are stored locally; the path can be configured by the user.

  • Spectrogram.py: Itwinai class for transforming a dataset of time series into a dataset of spectrograms via the Q-transform. Q-transform parameters can be defined via the process.yaml file, while whitening parameters are read from the whiten.yaml file.

  • Model.py: Class defining the neural network architecture and the metrics used during the training and inference steps. During the inference step, the model is retrieved from the MLFlow catalogue.

  • Trainer.py: TorchTrainer class used for model training. See itwinai documentation for more details https://itwinai.readthedocs.io/latest/how-it-works/training/training.html#itwinai-torchtrainer.

  • Inference.py: Class for inference, denoising and veto.

Pipeline execution

To execute the pipeline, use the itwinai syntax. Assuming the working directory contains the config.yaml file:

itwinai exec-pipeline +pipe_key="pipeline name" +pipe_steps=[list,of,steps,to,execute]

The user can select which pipeline to execute via the pipe_key parameter. The predefined pipelines are:

  • preproc_pipeline: dataset preprocessing, channel selection and spectrogram dataset creation
  • training_pipeline: dataset splitting, filtering and NN training; logs weights, metrics and metadata on MLFlow and TensorBoard
  • inference_pipeline: feeds the inference dataset to a pretrained NN model performing denoising; logs metrics and metadata on TensorBoard
  • vis_dts: allows for visualization of denoised data, accuracy metrics, and other metadata via TensorBoard
  • glitchflow_pipeline: generates a synthetic dataset given a pretrained NN

If pipe_key is not specified, the training_pipeline will be executed by default. The user can further select the pipeline's substeps and their order to execute via the pipe_steps argument; if not given, the whole pipeline will be executed. See config.yaml for all substeps of each pipeline.
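For orientation, such a pipeline configuration might be shaped roughly as follows. The step names match those of the preprocessing pipeline, but the actual schema is defined by the plugin's config.yaml and itwinai's configuration system, so every field name below is an assumption:

```yaml
# Illustrative sketch only -- see the plugin's config.yaml for the real schema.
preproc_pipeline:
  steps:
    Data-processor:
      sample_rate: 2048        # hypothetical preprocessing parameter
    Annalisa-scan:
      config: conf/scan.yaml
    QT-dataset:
      qtransform: conf/process.yaml
      whitening: conf/whiten.yaml
```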

For example, the preprocessing pipeline:

itwinai exec-pipeline +pipe_key=preproc_pipeline

will execute the following steps:

  • Data-processor: the data preprocessing step
  • Annalisa-scan: the channel selection algorithm
  • QT-dataset: the spectrogram dataset creation

If, however, the user wants to perform a second channel selection and spectrogram dataset creation with different parameters (modifying the corresponding configuration files: scan.yaml for Annalisa-scan, and process.yaml and whiten.yaml for QT-dataset) on an already preprocessed dataset, they can run:

itwinai exec-pipeline +pipe_key=preproc_pipeline +pipe_steps=[Annalisa-scan,QT-dataset]

Data Visualization and Logging

The DT uses MLFlow and TensorBoard for logging, thanks to the itwinai integration. For installation, refer to the official documentation (see Requirements). For a local setup, the plain Python installation should be sufficient.

MLFlow

To launch MLFlow:

mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./artifacts --host <ip-address> --port 5005

  • --backend-store-uri: the database used to store tracking data. SQLite is the default, but other databases can be used.
  • --default-artifact-root: the MLFlow directory where artifacts will be stored. Logged models can be found here.
  • --host: the IP address of the server. Set it to 0.0.0.0 if working behind a proxy.
  • --port: 5005 is the default port.

Users can launch MLFlow and navigate through the different experiments and runs, allowing for a detailed display of:

  • Model weights
  • Training and validation loss
  • Accuracy metrics for both the denoising and vetoing tasks

Examples are reported in the figures below:

Metrics Dashboard
Figure 3: Overview of training and validation loss, model accuracy.

Models Overview
Figure 4: Summary of available models.

Runs Log
Figure 5: Log of recent training and evaluation runs for each experiment.

TensorBoard

To launch TensorBoard:

tensorboard --logdir <logdir> --host <ip-address> --port 6000

  • --logdir: the TensorBoard root directory.
  • --host: the IP address of the server. Set it to 0.0.0.0 if working behind a proxy.
  • --port: 6000 is the default port.

Users can launch TensorBoard and navigate through the logged events, which are categorized by run and timestamp, allowing for detailed visualization and analysis of the inference results, comprising:

  • Images of the real, generated, and cleaned data
  • Metadata containing veto flag information, organized by the GPS time of the analyzed data
  • Metadata for any data that failed to be cleaned, including the area and Signal-to-Noise Ratio (SNR) of glitches still visible after cleaning
  • Accuracy metrics for both the denoising and vetoing tasks

Examples are reported in the figures below:

Training accuracy
Figure 6: Training accuracy as a function of learning epochs for different fixed $SNR^2$ thresholds.

Denoising inference
Figure 7: Left: denoising inference, showing the real and generated spectrograms of the data used for inference and their absolute difference. Center and right: denoising and vetoing accuracy as a function of the $SNR^2$ threshold after training.

Training loss
Figure 8: Training and validation loss as a function of learning epochs.

Veto metadata
Figure 9: Denoising metadata. The table reports the GPS time of the data used for inference, a binary flag indicating whether the data was successfully cleaned, the maximum $SNR^2$ for uncleaned data, and the area (in pixels) of the residual glitch after (failed) cleaning.

Owner

  • Name: interTwin Community
  • Login: interTwin-eu
  • Kind: organization
  • Email: info@intertwin.eu

Co-designing and prototyping an interdisciplinary Digital Twin Engine.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: itwinai-plugin-template
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software

authors:
  - given-names: Matteo
    family-names: Bunino
    email: matteo.bunino@cern.ch
    affiliation: CERN
    orcid: 'https://orcid.org/0009-0008-5100-9300'
  
repository-code: 'https://github.com/interTwin-eu/itwinai-plugin-template'
url: 'https://itwinai.readthedocs.io/'
abstract: AI on cloud and HPC made simple for science
keywords:
  - Artificial intelligence
  - Machine learning
  - Digital twins
  - Climate research
  - Physics research
license: Apache-2.0

GitHub Events

Total
  • Watch event: 1
  • Member event: 1
  • Push event: 3
  • Pull request event: 1
  • Create event: 3
Last Year
  • Watch event: 1
  • Member event: 1
  • Push event: 3
  • Pull request event: 1
  • Create event: 3

Dependencies

.github/workflows/check-links.yml actions
  • actions/checkout v4 composite
  • gaurav-nelson/github-action-markdown-link-check v1 composite
.github/workflows/lint.yml actions
  • actions/checkout v4 composite
  • github/super-linter/slim v7 composite
.github/workflows/pytest.yaml actions
  • actions/checkout v4 composite
.github/workflows/sqaaas.yaml actions
  • eosc-synergy/sqaaas-assessment-action v2 composite
  • eosc-synergy/sqaaas-step-action v1 composite
Dockerfile docker
  • ghcr.io/intertwin-eu/itwinai torch-slim-latest build
pyproject.toml pypi
  • itwinai [torch]
  • pytest >=8.3.4
uv.lock pypi
  • 158 dependencies