sms
Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"
Science Score: 41.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.6%)
Repository
Basic Info
Statistics
- Stars: 9
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
[ICLR24] Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta
This repository contains the code to reproduce the experiments from the ICLR24 paper "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging". The code is based on PyTorch 1.9 and the experiment-tracking platform Weights & Biases. See the blog post or the Twitter thread for a TL;DR.
Structure and Usage
Structure
Experiments are started from the following file:
main.py: Starts experiments using the dictionary format of Weights & Biases.
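For illustration, such a run might be configured with a dictionary like the one below. The parameter names are taken from the Usage section of this README, but the exact keys `main.py` expects are assumptions, not verified against the code:

```python
# Hypothetical sketch of a wandb-style config dict for one prune-retrain run.
# Parameter names follow the Usage section; the exact keys are assumptions.
config = {
    "strategy": "IMP",              # Dense (pretrain), IMP (prune-retrain), or Ensemble (souping)
    "n_phases": 3,                  # total number of prune-retrain cycles
    "phase": 1,                     # which cycle this run belongs to
    "ensemble_by": "weight_decay",  # hyperparameter varied across retrained models
    "split_val": 0.0001,            # value of that hyperparameter for this run
    "n_splits_total": 3,            # number of models the later souping step expects
}
print(config["strategy"])  # → IMP
```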
The rest of the project is structured as follows:
- strategies: Contains the strategies used for training, pruning, and model averaging.
- runners: Contains classes to control training and the collection of metrics.
- metrics: Contains all metrics as well as FLOP computation methods.
- models: Contains all model architectures used.
- utilities: Contains useful auxiliary functions and classes.
Usage
An entire experiment is subdivided into multiple steps, each consisting of multiple (potentially many) different runs and wandb experiments. First, a model has to be pretrained using the Dense strategy. This step is completely agnostic to any pruning specifications. Then, for each phase or prune-retrain cycle (specified by the n_phases parameter and controlled by the phase parameter), the following steps are executed:
1. Strategy IMP: Prune the model using the IMP strategy. Here, it is important to specify the ensemble_by, split_val and n_splits_total parameters:
- ensemble_by: The parameter which is varied when retraining multiple models. E.g. setting this to weight_decay will train multiple models with different weight decay values.
- split_val: The value by which the ensemble_by parameter is split. E.g. setting this to 0.0001 while using weight_decay as ensemble_by will retrain a model with weight decay 0.0001, all else being equal.
- n_splits_total: The total number of splits for the ensemble_by parameter. If set to three, the souping operation in the next step will expect three models to be present, given the ensemble_by configuration.
2. Strategy Ensemble: Soup the models. This step averages the weights of the models specified by the ensemble_by parameter. The ensemble_by and n_splits_total parameters have to match those of the previous step; split_val is not used in this step and has to be set to None. The ensemble_method parameter controls how the models are averaged.
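At its core, the souping operation in step 2 is a uniform average of the retrained models' weights. A minimal, self-contained sketch follows, with plain Python floats standing in for PyTorch tensors; `average_models` is a hypothetical helper for illustration, not the repository's API:

```python
# Hypothetical sketch of the "souping" (Ensemble) step: uniformly average
# the weights of the n_splits_total retrained models, key by key. The real
# code would operate on PyTorch state_dicts; floats keep the sketch simple.

def average_models(state_dicts):
    """Uniformly average a list of state dicts, key by key."""
    assert state_dicts, "need at least one model to average"
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n for k in state_dicts[0]}

# Three models retrained with different weight decay values
# (ensemble_by='weight_decay', n_splits_total=3).
models = [
    {"layer.weight": 0.9, "layer.bias": 0.1},
    {"layer.weight": 1.1, "layer.bias": 0.3},
    {"layer.weight": 1.0, "layer.bias": 0.2},
]
soup = average_models(models)
print(soup["layer.weight"])  # → 1.0
```

In the paper's setup, all retrained models inherit the same sparsity mask from the pruning step, so zero weights remain zero under averaging and the souped model retains the sparsity pattern.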
Citation
If you find the paper or the implementation useful for your own research, please consider citing:
@inproceedings{zimmer2024sparse,
title={Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging},
author={Max Zimmer and Christoph Spiegel and Sebastian Pokutta},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=xx0ITyHp3u}
}
Owner
- Name: IOL Lab
- Login: ZIB-IOL
- Kind: organization
- Location: Germany
- Website: https://iol.zib.de
- Repositories: 27
- Profile: https://github.com/ZIB-IOL
Working on optimization and learning at the intersection of mathematics and computer science.
Citation (citation.bib)
@article{zimmer2023sparse,
title={Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging},
author={Zimmer, Max and Spiegel, Christoph and Pokutta, Sebastian},
journal={arXiv preprint arXiv:2306.16788},
year={2023}
}
GitHub Events
Total
- Watch event: 2
- Fork event: 1
Last Year
- Watch event: 2
- Fork event: 1
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0