sms

Code to reproduce the experiments of the ICLR24 paper "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"

https://github.com/zib-iol/sms

Science Score: 41.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.6%) to scientific vocabulary

Keywords

averaging deep-learning neural-network optimization pruning pytorch sparsity

Scientific Fields

Materials Science (Physical Sciences) - 40% confidence
Last synced: 4 months ago

Repository

Code to reproduce the experiments of the ICLR24 paper "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"

Basic Info
  • Host: GitHub
  • Owner: ZIB-IOL
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 57.6 KB
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
averaging deep-learning neural-network optimization pruning pytorch sparsity
Created over 2 years ago · Last pushed about 1 year ago
Metadata Files
  • Readme
  • Citation

README.md

[ICLR24] Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging

Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta

This repository contains the code to reproduce the experiments from the ICLR24 paper "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging". The code is based on PyTorch 1.9 and the experiment-tracking platform Weights & Biases. See the blog post or the Twitter thread for a TL;DR.

Structure and Usage

Structure

Experiments are started from the following file:

  • main.py: Starts experiments using the dictionary format of Weights & Biases.
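main.py consumes such a dictionary via wandb; the sketch below shows what a launch could look like. The parameter names match those explained under Usage, but the overall key layout is an assumption, not the repository's verified schema.

```python
# Hypothetical sketch: launching a run with a wandb-style config dictionary.
# Only the parameter names (n_phases, phase, ensemble_by, split_val,
# n_splits_total) come from the README; the structure is an assumption.
import wandb

config = {
    "strategy": "IMP",             # Dense (pretraining), IMP, or Ensemble
    "n_phases": 3,                 # number of prune-retrain cycles
    "phase": 1,                    # which cycle this run belongs to
    "ensemble_by": "weight_decay", # parameter varied across retrained models
    "split_val": 0.0001,           # value of ensemble_by for this run
    "n_splits_total": 3,           # models expected by the souping step
}
wandb.init(project="sms", config=config)
# main.py would then read wandb.config to dispatch the chosen strategy.
```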

The rest of the project is structured as follows:

  • strategies: Contains the strategies used for training, pruning and model averaging.
  • runners: Contains classes to control the training and collection of metrics.
  • metrics: Contains all metrics as well as FLOP computation methods.
  • models: Contains all model architectures used.
  • utilities: Contains useful auxiliary functions and classes.

Usage

An entire experiment is subdivided into multiple steps, each consisting of multiple (potentially many) different runs and wandb experiments. First of all, a model has to be pretrained using the Dense strategy. This step is completely agnostic to any pruning specifications. Then, for each phase or prune-retrain cycle (specified by the n_phases parameter and controlled by the phase parameter), the following steps are executed:

1. Strategy IMP: Prune the model using the IMP strategy. Here, it is important to specify the ensemble_by, split_val and n_splits_total parameters:
   - ensemble_by: The parameter which is varied when retraining multiple models. E.g., setting this to weight_decay will train multiple models with different weight decay values.
   - split_val: The value of the ensemble_by parameter for this particular split. E.g., setting this to 0.0001 while using weight_decay as ensemble_by will retrain a model with weight decay 0.0001, all else being equal.
   - n_splits_total: The total number of splits for the ensemble_by parameter. If set to three, the souping operation in the next step will expect three models to be present, given the ensemble_by configuration.
2. Strategy Ensemble: Soup the models. This step averages the weights of the models specified by the ensemble_by parameter, which has to be the same as in the previous step; n_splits_total has to be the same as well. split_val is not used in this step and has to be set to None. The ensemble_method parameter controls how the models are averaged (a minimal averaging sketch follows below).
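The repository implements souping in its strategies module; the snippet below is only a minimal sketch of what uniform weight averaging amounts to, under the assumption that each retrained model was saved as a plain PyTorch state_dict. The file names and the average_state_dicts helper are hypothetical.

```python
# Hypothetical sketch of uniform souping: average the parameters of several
# retrained checkpoints. The repository's actual logic lives in strategies/.
import torch

def average_state_dicts(state_dicts):
    """Element-wise mean of a list of state_dicts (uniform soup)."""
    avg = {}
    for key, ref in state_dicts[0].items():
        if ref.is_floating_point():
            # Stack the corresponding tensors and take their mean.
            avg[key] = torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
        else:
            # Integer buffers (e.g. BatchNorm's num_batches_tracked) are copied.
            avg[key] = ref.clone()
    return avg

# Example: three models retrained with different weight_decay values
# (n_splits_total = 3), as produced by the IMP step. Paths are hypothetical.
paths = ["model_wd_0.0001.pt", "model_wd_0.0005.pt", "model_wd_0.001.pt"]
soup = average_state_dicts([torch.load(p, map_location="cpu") for p in paths])
# model.load_state_dict(soup)  # load the averaged weights into the architecture
```

Because all retrained models start from the same pruned model and keep its mask fixed during retraining, the averaged weights retain the same sparsity pattern.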

Citation

If you find the paper or the implementation useful for your own research, please consider citing:

@inproceedings{zimmer2024sparse,
  title={Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging},
  author={Max Zimmer and Christoph Spiegel and Sebastian Pokutta},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=xx0ITyHp3u}
}

Owner

  • Name: IOL Lab
  • Login: ZIB-IOL
  • Kind: organization
  • Location: Germany

Working on optimization and learning at the intersection of mathematics and computer science

Citation (citation.bib)

@article{zimmer2023sparse,
  title={Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging},
  author={Zimmer, Max and Spiegel, Christoph and Pokutta, Sebastian},
  journal={arXiv preprint arXiv:2306.16788},
  year={2023}
}

GitHub Events

Total
  • Watch event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Fork event: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0