salsa-optimizer

SaLSa Optimizer implementation (No learning rates needed)

https://github.com/themody/no-learning-rates-needed-introducing-salsa-stable-armijo-line-search-adaptation

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.6%) to scientific vocabulary

Keywords

learning-rate line-search optimization pytorch

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: TheMody
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 2.07 MB

Statistics

Stars: 31
Watchers: 1
Forks: 1
Open Issues: 0
Releases: 0

Topics

learning-rate line-search optimization pytorch

Created about 2 years ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

SALSA - Optimizer

The official repository for the paper "No learning rates needed: Introducing SALSA - Stable Armijo Line Search Adaptation", with additional features. If you have any questions, remarks, or issues with the SALSA optimizer, please do not hesitate to contact us on GitHub.

Install

Install from PyPI:

pip install SaLSa-Optimizer

or clone the repo and run:

pip install .

Use:

Example Usage:

    from salsa.SaLSA import SaLSA
    optimizer = SaLSA(model.parameters())

The typical PyTorch training step needs to be changed from:

    x, y = load_input_data()
    optimizer.zero_grad()
    y_pred = model(x)
    loss = criterion(y_pred, y)
    loss.backward()
    optimizer.step()
    scheduler.step()

to:

    x, y = load_input_data()
    def closure(backwards = False):
        y_pred = model(x)
        loss = criterion(y_pred, y)
        if backwards: loss.backward()
        return loss
    optimizer.zero_grad()
    loss = optimizer.step(closure = closure)

At the moment, gradient scalers cannot be used together with SALSA.

This code change is necessary because the optimizer needs to perform additional forward passes and therefore needs the forward pass encapsulated in a function. See the fit() method in embedder.py for example usage.
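To make the role of the closure concrete, here is a minimal sketch of a plain backtracking Armijo line search on a toy quadratic, written in pure Python. It is not the SALSA implementation: ToyModel, make_closure, armijo_step, and the default values for c, step, and shrink are all hypothetical names and choices made for illustration. Only the closure(backwards=False) signature is taken from the usage example.

```python
# Illustrative only: a generic backtracking Armijo line search.
# This is NOT the SALSA code; every name below is hypothetical.

class ToyModel:
    """Holds a parameter vector and the gradient of the last backward pass."""
    def __init__(self, w):
        self.w = list(w)
        self.grad = [0.0] * len(self.w)

def make_closure(model, target):
    # Same signature as in the README: closure(backwards=False).
    def closure(backwards=False):
        # "Forward pass": loss = sum_i (w_i - t_i)^2
        loss = sum((wi - ti) ** 2 for wi, ti in zip(model.w, target))
        if backwards:
            # "Backward pass": d loss / d w_i = 2 * (w_i - t_i)
            model.grad = [2.0 * (wi - ti) for wi, ti in zip(model.w, target)]
        return loss
    return closure

def armijo_step(model, closure, c=0.3, step=1.0, shrink=0.5, max_tries=30):
    """Shrink `step` until loss(w - step*g) <= loss(w) - c*step*||g||^2."""
    loss = closure(backwards=True)      # one forward + backward pass
    grad = list(model.grad)
    grad_sq = sum(g * g for g in grad)
    start = list(model.w)
    forward_passes = 1
    for _ in range(max_tries):
        model.w = [wi - step * gi for wi, gi in zip(start, grad)]
        new_loss = closure()            # extra forward pass, no gradients
        forward_passes += 1
        if new_loss <= loss - c * step * grad_sq:
            return new_loss, step, forward_passes
        step *= shrink
    model.w = start                     # no acceptable step found: undo
    return loss, 0.0, forward_passes

model = ToyModel([3.0, -2.0])
closure = make_closure(model, target=[1.0, 1.0])
loss, step, n_forward = armijo_step(model, closure)
print(loss, step, n_forward)  # → 0.0 0.5 3
```

In this toy run the first trial step of 1.0 is rejected and 0.5 is accepted, so a single update costs three forward passes. An optimizer can only re-evaluate the loss like this if the forward pass is handed to it as a callable.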

Disclaimer:

This optimizer was only tested and validated to perform well with MLPs, Transformers, and Convolutional Neural Networks (Mamba was also tested but needed a c value of 0.7). Your results may vary when you try it on other architectures and/or use cases for which it was not validated. If you encounter any issues, try tuning the c value of the optimizer and/or open an issue on GitHub.
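For orientation: the README does not define c, but assuming it is the sufficient-decrease constant of the classical Armijo condition, a step size \eta_k along the negative gradient is accepted when the following holds. The symbols f, w_k, and \eta_k are introduced here for illustration only.

```latex
f\bigl(w_k - \eta_k \nabla f(w_k)\bigr) \;\le\; f(w_k) - c \,\eta_k \,\lVert \nabla f(w_k) \rVert^2,
\qquad 0 < c < 1
```

Under this reading, a larger c demands more loss decrease per unit step and therefore tends to accept smaller step sizes. The exact criterion used by SALSA may differ in detail, so consult the paper before relying on this form.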

Replicating Results

Youtube Video Explaining the Concept:

Dependencies:

pytorch <3
numpy <3

For replicating the results (not needed for using the optimizer):

- pip install transformers for huggingface transformers <3
- pip install datasets for huggingface datasets <3
- pip install tensorflow-datasets for tensorflow datasets <3
- pip install wandb for optional logging <3
- for easy replication use conda and environment.yml, e.g.:

$ conda env create -f environment.yml
$ conda activate sls3

The results of the line search algorithm are:

[Figure: Loss Curve]

On average a 50% reduction in final loss, while needing only about 3% extra compute time.

For replicating the main Results of the Paper run:

$ python test/run_multiple.py
$ python test/run_multiple_img.py

For replicating specific runs or trying out different hyperparameters use:

$ python test/main.py

and change the test/config.json file appropriately

Older Versions of this Optimizer:

https://github.com/TheMody/Faster-Convergence-for-Transformer-Fine-tuning-with-Line-Search-Methods
https://github.com/TheMody/Improving-Line-Search-Methods-for-Large-Scale-Neural-Network-Training

Please cite:

No learning rates needed: Introducing SALSA - Stable Armijo Line Search Adaptation, by Philip Kenneweg, Tristan Kenneweg, Fabian Fumagalli, and Barbara Hammer, published at IJCNN 2024 and on arXiv.

@INPROCEEDINGS{10650124,
  author={Kenneweg, Philip and Kenneweg, Tristan and Fumagalli, Fabian and Hammer, Barbara},
  booktitle={2024 International Joint Conference on Neural Networks (IJCNN)},
  title={No learning rates needed: Introducing SALSA - Stable Armijo Line Search Adaptation},
  year={2024},
  volume={},
  number={},
  pages={1-8},
  keywords={Training;Schedules;Codes;Search methods;Source coding;Computer architecture;Transformers},
  doi={10.1109/IJCNN60899.2024.10650124}
}

Owner

Name: Philip Kenneweg
Login: TheMody
Kind: user
Company: University of Bielefeld

Repositories: 16
Profile: https://github.com/TheMody

Citation (CITATION.cff)

cff-version: 1.2.0
message: If you use this software, please cite both the article from preferred-citation and the software itself.
authors:
  - family-names: Kenneweg
    given-names: Philip
  - family-names: Kenneweg
    given-names: Tristan
  - family-names: Fumagalli
    given-names: Fabian
  - family-names: Hammer
    given-names: Barbara
title: 'No learning rates needed: Introducing SALSA - Stable Armijo Line Search Adaptation'
version: 1.0.0
doi: 10.1109/IJCNN60899.2024.10650124
date-released: '2024-11-18'
preferred-citation:
  authors:
    - family-names: Kenneweg
      given-names: Philip
    - family-names: Kenneweg
      given-names: Tristan
    - family-names: Fumagalli
      given-names: Fabian
    - family-names: Hammer
      given-names: Barbara
  title: 'No learning rates needed: Introducing SALSA - Stable Armijo Line Search Adaptation'
  doi: 10.1109/IJCNN60899.2024.10650124
  type: article-journal
  pages: 1-8
  year: '2024'
  conference: 2024 International Joint Conference on Neural Networks (IJCNN)
  publisher: {}

GitHub Events

Total

Watch event: 4
Push event: 22
Fork event: 1

Last Year

Watch event: 4
Push event: 22
Fork event: 1

Packages

Total packages: 1
Total downloads:
- pypi 14 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 1
Total maintainers: 1

pypi.org: salsa-optimizer

A pytorch optimizer that does not need a learning rate

Homepage: https://github.com/TheMody/No-learning-rates-needed-Introducing-SALSA-Stable-Armijo-Line-Search-Adaptation
Documentation: https://github.com/TheMody/No-learning-rates-needed-Introducing-SALSA-Stable-Armijo-Line-Search-Adaptation/wiki
License: MIT
Latest release: 0.1.0
published almost 2 years ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 14 Last month

Rankings

Dependent packages count: 10.4%

Average: 34.5%

Dependent repos count: 58.6%

Maintainers (1)

PhilipKenneweg

Last synced: 10 months ago

Dependencies

environment.yml pypi

mpmath ==1.2.1
