team_pytorch

This repository provides a PyTorch implementation of the TEAM model, originally developed in TensorFlow. The purpose of this project is to convert the original TensorFlow codebase into PyTorch to facilitate model training and experimentation using the PyTorch framework.

https://github.com/wen-wei-055/team_pytorch

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.9%) to scientific vocabulary

Repository


Basic Info
  • Host: GitHub
  • Owner: wen-wei-055
  • Language: Python
  • Default Branch: master
  • Size: 36.1 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Citation

README.md

PyTorch Implementation of TEAM (TensorFlow to PyTorch Conversion)

This repository provides a PyTorch implementation of the TEAM model, originally developed in TensorFlow. The purpose of this project is to convert the original TensorFlow codebase into PyTorch to facilitate model training and experimentation using the PyTorch framework.

Original Work

The TEAM model was introduced in the following paper:
Title: The transformer earthquake alerting model: a new versatile approach to earthquake early warning
Authors: Jannes Münchmeyer, Dino Bindi, Ulf Leser, Frederik Tilmann
Published in: Geophysical Journal International
Link to the paper: TEAM Paper

The original TensorFlow implementation can be found at this GitHub repository:
Original TensorFlow Code

Conversion to PyTorch

The goal of this project is to ensure the TEAM model functions correctly in PyTorch, maintaining its original accuracy and performance. We aim to make the codebase more accessible to researchers and developers who prefer using PyTorch for deep learning models.

Additional Evaluation Method

In addition to the original evaluation methods used in the paper, we have introduced an extended evaluation process. Specifically, we added an evaluation that includes all stations for each event, allowing for a more comprehensive assessment of the model's performance across different conditions and datasets.

Training

Model and training configurations are defined in JSON files. Please consult the folders magloc_configs and pga_configs for example configurations.

To start model training use: python train.py --config [CONFIG].json

To test a config by running a model with only a few data points, the command line flag --test_run can be used.

The training saves model weights to the given weight path. In addition, it writes logs for tensorboard to /logs/scalars.

Config options

The configurations are split into model and training parameters. In addition, there are three global parameters: the random seed seed, the model type, which currently needs to be set to transformer, and ensemble, the size of the ensemble. By default, no ensemble is used and a single model is trained. Note that not all parameter combinations are possible, due to both theoretical and implementation restrictions; invalid combinations might lead to crashes.
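As an illustrative sketch of this layout (the exact key names, in particular "model_params" and "training_params", are assumptions here and should be checked against the example configs in magloc_configs and pga_configs), a minimal configuration might look like:

```json
{
  "seed": 42,
  "model": "transformer",
  "ensemble": 1,
  "model_params": {
    "transformer_layers": 6,
    "magnitude_mixture": 5
  },
  "training_params": {
    "lr": 1e-4,
    "workers": 10
  }
}
```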

Model parameters

Parameter | Default value | Description
---- | ---- | ----
max_stations | 25 | Maximum number of stations in training
waveform_model_dims | (500, 500, 500) | Dimensions of the MLP in the feature extractor
output_mlp_dims | (150, 100, 50, 30, 10) | Dimensions of the MLP in the mixture density output for magnitude and PGA
output_location_dims | (150, 100, 50, 50, 50) | Dimensions of the MLP in the mixture density output for location
wavelength | ((0.01, 10), (0.01, 10), (0.01, 10)) | Wavelength ranges for the position embeddings (latitude, longitude, depth)
mad_params | {"n_heads": 10, "att_dropout": 0.0, "initializer_range": 0.02} | Parameters for the multi-head self-attention
ffn_params | {"hidden_dim": 1000} | Parameters for the transformer feed-forward layer
transformer_layers | 6 | Number of transformer layers
hidden_dropout | 0.0 | Transformer hidden dropout
activation | 'relu' | Activation function for CNNs and MLPs
n_pga_targets | 0 | Number of PGA targets
location_mixture | 5 | Size of the Gaussian mixture for location
pga_mixture | 5 | Size of the Gaussian mixture for PGA
magnitude_mixture | 5 | Size of the Gaussian mixture for magnitude
borehole | False | Whether the data contains borehole measurements
bias_mag_mu | 1.8 | Bias initializer for magnitude mu
bias_mag_sigma | 0.2 | Bias initializer for magnitude sigma
bias_loc_mu | 0 | Bias initializer for location mu
bias_loc_sigma | 1 | Bias initializer for location sigma
event_token_init_range | None | Initializer for the event token. Defaults to ones if the value is None
dataset_bias | False | Adds a scalar bias term to the output for joint training on multiple datasets
no_event_token | False | Removes the event token, disabling magnitude and location estimation
downsample | 5 | Downsampling factor for the first CNN layer
rotation | None | Rotation to be applied to latitude and longitude before the position embedding
rotation_anchor | None | Point to rotate around
skip_transformer | False | Replace the transformer by a pooling layer
alternative_coords_embedding | False | Concatenate position instead of adding position embeddings

Training parameters

To accommodate joint training, parameters are split into general training parameters and generator parameters. For training on a single dataset, all generator parameters can be given directly among the training parameters. For joint training, a list of generator parameter dictionaries needs to be given. Check the provided configs for examples.
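A hypothetical sketch of the joint-training case described above (the key names "data_path", "generator_params", and the magnitude keys "MA" and "Mw" are illustrative assumptions; consult the provided configs for the real layout):

```json
{
  "training_params": {
    "data_path": ["italy.hdf5", "chile.hdf5"],
    "generator_params": [
      {"key": "MA", "batch_size": 32},
      {"key": "Mw", "batch_size": 32}
    ]
  }
}
```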

General training parameters

Parameter | Default value | Description
---- | ---- | ----
weight_path | - | Path to save model weights. Needs to be empty.
data_path | - | Path to the training data. If given a list, the model assumes joint training on multiple datasets.
overwrite_sampling_rate | - | If given, all data is resampled to the given sampling rate. Needs to be a divisor of the sampling rate given in the data.
ensemble_rotation | False | Whether position embeddings should be rotated between the different ensemble members.
single_station_model_path | - | Weights of the initial model for the feature extraction. If not given, the model will train a single-station model first to initialize the feature extraction.
lr | - | Learning rate
clipnorm | - | Norm for gradient clipping
filter_single_station_by_pick | False | For single-station training, only train on traces containing a pick.
workers | 10 | Number of parallel workers for data preprocessing
epochs_single_station | - | Number of training epochs for the single-station model
load_model_path | - | Initial weights for the model. Not recommended; use transfer_model_path instead.
transfer_model_path | - | Initial weights for the model. Also transfers weights between models with and without borehole data.
ensemble_load | False | Load weights for each ensemble member from the corresponding member of another ensemble.
wait_for_load | False | Wait if the weight file does not exist. Otherwise raises an exception.
loss_weights | - | Loss weights given as a dict. Depending on the model configuration, required keys are magnitude, location and pga.
lr_decay_patience | 6 | Patience for learning rate decay
epochs_full_model | - | Number of training epochs for the full model

Generator params

Parameter | Default value | Description
---- | ---- | ----
key | - | Key of the magnitude value in the event metadata
batch_size | 32 | Size of training batches
cutout_start, cutout_end | None | Start and end of the possible cutout in seconds relative to the end of the noise. The cutout always indicates the end of the available data.
noise_seconds | 5 | Number of seconds of noise. Only used for expressing cutout boundaries in terms of noise and for times in evaluation.
sliding_window | False | If true, instead of using zero-padding for real-time processing, uses a sliding window, i.e., randomly selects a window according to the given cutout. Note that this usually requires training data with more input samples.
shuffle | True | Shuffle order of events
coords_target | True | Return target coordinates as outputs
oversample | 1 | Number of times to show each event per epoch
pos_offset | (-21, -69) | Scalar shift applied to latitude and longitude
label_smoothing | False | Enables label smoothing for large magnitudes
station_blinding | False | Randomly zeros out stations in each training example
magnitude_resampling | 3 | Factor to upsample the number of large-magnitude events
adjust_mean | True | Sets the mean of all waveform traces to zero. Disabling this will cause a knowledge leak!
transform_target_only | False | Only transform target coordinates, but not station coordinates
trigger_based | False | Disable data from stations without a trigger
min_upsample_magnitude | 2 | Minimum magnitude for upsampling; events above this magnitude are upsampled
disable_station_foreshadowing | False | Zeros coordinates for stations without data
selection_skew | None | If given, prefers stations closer to the event
pga_from_inactive | False | Predict PGA for stations without waveforms too
integrate | False | Integrate waveform traces
select_first | False | Only use the closest stations
fake_borehole | False | Adds 3 artificial channels to fake borehole data
scale_metadata | True | Rescale coordinates. Not required with position embeddings.
pga_key | pga | Key for the PGA values in the dataset
p_pick_limit | 5000 | Maximum pick to assume for selection skew. Ensures the probability of selection is positive for all stations.
coord_keys | None | Keys for the event coordinates in the event metadata. If None, they are detected automatically.
upsample_high_station_events | None | Factor to upsample events recorded at many stations
pga_selection_skew | None | Similar to selection_skew, but for PGA targets
shuffle_train_dev | False | Shuffle events between training and development sets
custom_split | None | Use a custom split instead of the temporal 60:10:30 split. Custom splits are defined in loader.py.
min_mag | None | Only use events with at least this magnitude
decimate_events | None | Integer k; if given, only load every kth event.
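To illustrate the difference between the zero-padding and sliding-window cutout modes described above, here is a minimal toy sketch in numpy (not the repository's actual data pipeline; trace length, cutout position, and window length are made-up values):

```python
import numpy as np

rng = np.random.default_rng(0)
trace = rng.normal(size=3000)  # a toy waveform of 3000 samples
cutout = 1200                  # randomly drawn end of the available data

# Zero-padding mode: keep samples up to the cutout, zero everything after,
# so the model input length stays fixed.
padded = trace.copy()
padded[cutout:] = 0.0

# Sliding-window mode: instead select a fixed-length window ending at the
# cutout, which requires the raw trace to be long enough.
window_len = 1000
window = trace[cutout - window_len:cutout]

print(padded.shape, window.shape)  # (3000,) (1000,)
```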

Evaluation

For evaluating a model use python evaluate.py --experiment_path [WEIGHTS_PATH]. To also evaluate PGA estimation, use --pga; to evaluate warning times, use --head_times. By default, the development set is evaluated. To evaluate the test set, use the --test flag. Further options are documented in the Python file.

The evaluation creates an evaluation subfolder in the weights path, containing a statistics file, multiple plots, and a prediction file. The statistics file lists, for each target (magnitude, location, PGA) and each time step, the values of the performance metrics:
  • Magnitude: R², RMSE, MAE
  • Location: hypocentral RMSE and MAE, epicentral RMSE and MAE
  • PGA: R², RMSE, MAE
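The magnitude and PGA metrics can be reproduced from predictions with a few lines of numpy. This is a standalone sketch on toy data (the repository computes the metrics internally):

```python
import numpy as np

y_true = np.array([3.0, 4.5, 5.1, 6.2])  # toy true magnitudes
y_pred = np.array([3.2, 4.4, 5.5, 6.0])  # toy predicted magnitudes

residual = y_pred - y_true
rmse = np.sqrt(np.mean(residual ** 2))
mae = np.mean(np.abs(residual))
# R^2: fraction of variance explained relative to predicting the mean
r2 = 1 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print(f"RMSE={rmse:.3f} MAE={mae:.3f} R2={r2:.3f}")
# → RMSE=0.250 MAE=0.225 R2=0.953
```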

The predictions are a pickle file containing a list consisting of:
  • Evaluation times
  • Magnitude predictions: numpy array with shape (times, events, mixture, (alpha, mu, sigma))
  • Location predictions: numpy array with shape (times, events, mixture, (alpha, mu latitude, mu longitude, mu depth, sigma latitude, sigma longitude, sigma depth))
  • PGA predictions: list containing one entry for each time, each containing a list of events. Each event is a numpy array with shape (station, mixture, (alpha, mu, sigma))
  • Warning time results: list of events, each event containing:
    • times of predicted warnings, array with shape (stations, PGA thresholds, alpha)
    • times of actual warnings, array with shape (stations, PGA thresholds)
    • distance of stations to event, array with shape (stations,)
  • Values of alpha
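A sketch of how such a prediction file could be unpacked, using dummy arrays with the shapes described above (illustrative only; verify the actual file layout against evaluate.py):

```python
import io
import pickle

import numpy as np

times, events, mixture = 4, 2, 5
# Dummy objects mimicking the first entries of the described structure
predictions = [
    np.linspace(0, 25, times),              # evaluation times
    np.zeros((times, events, mixture, 3)),  # magnitude: (alpha, mu, sigma)
    np.zeros((times, events, mixture, 7)),  # location: alpha + 3 mu + 3 sigma
]

# Round-trip through pickle, as the evaluation script would write the file
buf = io.BytesIO()
pickle.dump(predictions, buf)
buf.seek(0)
eval_times, mag_preds, loc_preds = pickle.load(buf)

# e.g. mixture weights (alpha) of event 0 at the first evaluation time
alphas = mag_preds[0, 0, :, 0]
print(eval_times.shape, alphas.shape)  # (4,) (5,)
```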

Datasets

The dataset for Italy is available at 10.5880/GFZ.2.4.2020.004. The dataset for Chile is available at 10.5880/GFZ.2.4.2021.002. To obtain the dataset for Japan, please run the following commands. Obtaining the data requires an account with NIED; the download script will prompt for your login credentials.

python japan.py --action download_events --catalog resources/kiknet_events --output [OUTPUT FOLDER]
python japan.py --action extract_events --input [DATA FOLDER] --output [HDF5 OUTPUT PATH]

The download sometimes crashes due to connection issues with NIED. It can be resumed by simply restarting the download job.

The extraction can be run in parallel using sharding. To this end, use the flag --shards [NUMBER OF SHARDS] and start jobs with --shard_id between 0 and [NUMBER OF SHARDS] - 1. Run all shards with the same configuration; the output path will be adjusted automatically. To merge the shards, use python japan.py --action merge_hdf5 --input [PATH OF ALL SHARDS] --output [HDF5 OUTPUT PATH].

Baselines

Baseline implementations for magnitude estimation and early warning are contained in mag_baselines.py and pga_baselines.py. For reference on their usage, please see the sample configs in mag_baseline_configs and pga_baseline_configs and the implementation.

Owner

  • Name: william
  • Login: wen-wei-055
  • Kind: user

Citation (CITATION.md)

If you use this repository or any part of the converted PyTorch code, please cite the following software and publications:

Software Citation:

@misc{munchmeyer2021softwareteam,
  doi = {10.5880/GFZ.2.4.2021.003},
  author = {M\"{u}nchmeyer,  Jannes and Bindi,  Dino and Leser,  Ulf and Tilmann,  Frederik},
  title = {TEAM – The transformer earthquake alerting model},
  publisher = {GFZ Data Services},
  year = {2021},
  note = {V. 1.0},
  copyright = {GPLv3}
}

Key Publications:

@article{munchmeyer2020team,
  title={The transformer earthquake alerting model: A new versatile approach to earthquake early warning},
  author={M{\"u}nchmeyer, Jannes and Bindi, Dino and Leser, Ulf and Tilmann, Frederik},
  journal={Geophysical Journal International},
  year={2020},
  doi={10.1093/gji/ggaa609}
}

@article{munchmeyer2021teamlm,
  title={Earthquake magnitude and location estimation from real time seismic waveforms with a transformer network},
  author={M{\"u}nchmeyer, Jannes and Bindi, Dino and Leser, Ulf and Tilmann, Frederik},
  journal={Geophysical Journal International},
  year={2021},
  doi={10.1093/gji/ggab139}
}

Please make sure to reference these works in any publications or projects that make use of this repository.

GitHub Events

Total
  • Public event: 1
Last Year
  • Public event: 1

Dependencies

requirements.txt pypi
  • geopy ==2.4.0
  • h5py ==3.7.0
  • matplotlib ==3.7.3
  • numpy ==1.22.4
  • obspy ==1.4.0
  • scipy ==1.7.0