leaf-aaf

Reference implementation of Learnable Extended Activation Function and the corresponding experiment

https://github.com/s-kostyuk/leaf-aaf

Science Score: 18.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.8%) to scientific vocabulary

Repository


Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

Learnable Extended Activation Function (LEAF) for Deep Neural Networks

Implementation of the experiment as published in the paper "Learnable Extended Activation Function for Deep Neural Networks" by Yevgeniy Bodyanskiy and Serhii Kostiuk.

Running experiments

  1. Use a machine with an NVIDIA GPU (recommended) that has at least 2 GiB of VRAM.
  2. Install the requirements from requirements.txt.
  3. Set CUBLAS_WORKSPACE_CONFIG=:4096:8 in the environment variables.
  4. Use the root of this repository as the current directory.
  5. Add the current directory to PYTHONPATH so that Python can find the modules.

This repository contains a wrapper script, run_experiment.sh, that sets all the required environment variables. Use the bash shell to run an experiment through the wrapper script.

Example:

    user@host:~/repo_path$ ./run_experiment.sh experiments/train_new_base.py
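When the wrapper script is not convenient (for instance, when launching runs from another Python process), the manual steps listed above can be sketched roughly as follows. The helper names `experiment_env` and `run_experiment` are hypothetical and not part of this repository. The `:4096:8` value is the one PyTorch documents for deterministic cuBLAS behavior on CUDA 10.2 and later, which is presumably why it is required here.

```python
import os
import subprocess
import sys
from pathlib import Path


def experiment_env(repo_root, base_env=None):
    """Return a copy of the environment with the variables from the steps above."""
    env = dict(os.environ if base_env is None else base_env)
    # Step 3: cuBLAS workspace setting requested by the README.
    env["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    # Step 5: prepend the repository root so its modules are importable.
    root = str(Path(repo_root).resolve())
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = root if not existing else root + os.pathsep + existing
    return env


def run_experiment(repo_root, script, *args):
    """Run an experiment script with the repository root as cwd (step 4)."""
    cmd = [sys.executable, script, *args]
    return subprocess.run(cmd, cwd=repo_root, env=experiment_env(repo_root), check=True)


if __name__ == "__main__":
    # Only prints the prepared variables; call run_experiment(...) to launch a run.
    env = experiment_env(".")
    print(env["CUBLAS_WORKSPACE_CONFIG"], env["PYTHONPATH"])
```

With this sketch, `run_experiment(".", "experiments/train_new_base.py")` would be the rough equivalent of the wrapper call above, assuming the script can be started directly with the Python interpreter.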

Reproducing the results from the paper

  1. Training LeNet-5 and KerasNet networks with linear units from scratch:

    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_lus base
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_lus ahaf
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_lus ahaf --dspu4
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_lus leaf --p24sl
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_lus leaf --p24sl --dspu4

  2. Training LeNet-5 and KerasNet networks with bounded functions (the all_bfs set) from scratch:

    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_bfs base
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_bfs ahaf
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_bfs ahaf --dspu4
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_bfs leaf --p24sl
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --opt adam --end_ep 100 --acts all_bfs leaf --p24sl --dspu4

  3. On the stability of LEAF-as-ReLU:

    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --end_ep 100 --acts ReLU --net KerasNet --ds CIFAR-10 \
        --opt adam leaf
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --end_ep 100 --acts ReLU --net KerasNet --ds CIFAR-10 \
        --opt adam leaf --p24sl
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --end_ep 100 --acts ReLU --net KerasNet --ds CIFAR-10 \
        --opt rmsprop ahaf
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --end_ep 100 --acts ReLU --net KerasNet --ds CIFAR-10 \
        --opt rmsprop leaf
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --end_ep 100 --acts ReLU --net KerasNet --ds CIFAR-10 \
        --opt rmsprop leaf --p24sl

Add the --wandb parameter to log the training process to Weights and Biases. Weights and Biases provides visualization of the parameter values and the gradient values during training.

  4. On the effect of synaptic weight initialization. Execute all commands below once for each of the seed values:

    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --seed 7823 --opt adam --ds CIFAR-10 base
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --seed 7823 --opt adam --ds CIFAR-10 ahaf
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --seed 7823 --opt adam --ds CIFAR-10 ahaf --dspu4
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --seed 7823 --opt adam --ds CIFAR-10 leaf --p24sl
    user@host:~/repo_path$ ./run_experiment.sh experiments/train_individual.py \
        --seed 7823 --opt adam --ds CIFAR-10 leaf --p24sl --dspu4

Seed values to evaluate: 42, 100, 128, 1999, 7823.
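Running every command once per seed amounts to 25 runs (5 seeds times 5 variants). A minimal sketch for generating those command lines is shown below. It uses only the flags that appear in the commands above; `build_commands` is a hypothetical helper, not something this repository provides.

```python
import shlex

SEEDS = [42, 100, 128, 1999, 7823]

# Activation variants and their extra flags, copied from the commands above.
VARIANTS = [
    ["base"],
    ["ahaf"],
    ["ahaf", "--dspu4"],
    ["leaf", "--p24sl"],
    ["leaf", "--p24sl", "--dspu4"],
]


def build_commands(seeds=SEEDS, variants=VARIANTS):
    """Return one argument list per (seed, variant) pair."""
    commands = []
    for seed in seeds:
        for variant in variants:
            commands.append([
                "./run_experiment.sh", "experiments/train_individual.py",
                "--seed", str(seed), "--opt", "adam", "--ds", "CIFAR-10",
                *variant,
            ])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed or pasted into a shell.
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Each argument list can also be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the runs sequentially.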

Visualization of experiment results

Use the tools from the post_experiment directory to visualize the training process, create the training result summary tables, and visualize the activation function form for LEAF/AHAF compared to the corresponding base activations.
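To give a feel for what comparing an adaptive activation with its base activation means, here is a small self-contained sketch. The functional form used, `(p1*x + p2) * sigmoid(p3*x) + p4`, and the parameter sets are assumptions made for illustration only; they are not taken from this README or from the post_experiment tools, and the paper gives the actual LEAF definition.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def leaf_like(x, p1, p2, p3, p4):
    """Assumed illustrative form: (p1*x + p2) * sigmoid(p3*x) + p4."""
    return (p1 * x + p2) * sigmoid(p3 * x) + p4


# Hypothetical parameter sets under which the assumed form matches a base activation.
AS_SILU = (1.0, 0.0, 1.0, 0.0)      # x * sigmoid(x)
AS_SIGMOID = (0.0, 1.0, 1.0, 0.0)   # sigmoid(x)
AS_TANH = (0.0, 2.0, 2.0, -1.0)     # 2 * sigmoid(2x) - 1 equals tanh(x)
AS_RELU = (1.0, 0.0, 1.0e4, 0.0)    # x * sigmoid(k*x) approaches ReLU for large k

if __name__ == "__main__":
    # Columns: x, tanh-like form, tanh, ReLU-like form, ReLU.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, leaf_like(x, *AS_TANH), math.tanh(x),
              leaf_like(x, *AS_RELU), max(x, 0.0))
```

The ReLU-like variant is only an approximation: close to zero the steep sigmoid is not yet saturated, so the two curves differ slightly there.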

Owner

  • Name: Serhii Kostiuk
  • Login: s-kostyuk
  • Kind: user
  • Location: Ukraine
  • Company: Student of nure.ua

Student, embedded developer. Works, opinions, comments and other content are of my own.

Citation (CITATION.bib)

@article{Bodyanskiy_Kostiuk_2023,
	title        = {Learnable Extended Activation Function for Deep Neural Networks},
	author       = {Bodyanskiy, Yevgeniy and Kostiuk, Serhii},
	year         = 2023,
	month        = {Oct.},
	journal      = {International Journal of Computing},
	volume       = 22,
	number       = 3,
	pages        = {311--318},
	doi          = {10.47839/ijc.22.3.3225},
	url          = {https://computingonline.net/computing/article/view/3225},
	abstractnote = {This paper introduces Learnable Extended Activation Function (LEAF) - an adaptive activation function that combines the properties of squashing functions and rectifier units. Depending on the target architecture and data processing task, LEAF adapts its form during training to achieve lower loss values and improve the training results. While not suffering from the "vanishing gradient" effect, LEAF can directly replace SiLU, ReLU, Sigmoid, Tanh, Swish, and AHAF in feed-forward, recurrent, and many other neural network architectures. The training process for LEAF features a two-stage approach when the activation function parameters update before the synaptic weights. The experimental evaluation in the image classification task shows the superior performance of LEAF compared to the non-adaptive alternatives. Particularly, LEAF-as-Tanh provides 7% better classification accuracy than hyperbolic tangents on the CIFAR-10 dataset. As empirically examined, LEAF-as-SiLU and LEAF-as-Sigmoid in convolutional networks tend to "evolve" into SiLU-like forms. The proposed activation function and the corresponding training algorithm are relatively simple from the computational standpoint and easily apply to existing deep neural networks.}
}

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1