af_replacement
Experiments and supporting code for the "Using Adaptive Activation Functions in Pre-Trained Artificial Neural Network Models" paper
Science Score: 67.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 8 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (10.5%) to scientific vocabulary
Repository
Experiments and supporting code for the "Using Adaptive Activation Functions in Pre-Trained Artificial Neural Network Models" paper
Basic Info
- Host: GitHub
- Owner: s-kostyuk
- License: MIT
- Language: Python
- Default Branch: main
- Size: 27.3 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Using adaptive activation functions in pre-trained artificial neural network models
Implementation of the experiment as published in the paper "Using adaptive activation functions in pre-trained artificial neural network models" by Yevgeniy Bodyanskiy and Serhii Kostiuk.
Goals of the experiment
The experiment:
- demonstrates the method of activation function replacement in pre-trained models using the VGG-like KerasNet [^1] CNN as the example base model;
- evaluates the inference result differences between the base pre-trained model and the same model with replaced activation functions;
- demonstrates the effectiveness of activation function fine-tuning when all other elements of the model are fixed (frozen);
- evaluates the performance of the KerasNet variants with different activation functions (adaptive and non-adaptive) trained in different regimes.
Description of the experiment
The experiment consists of the following steps:
- Train the base KerasNet network on the CIFAR-10 [^2] dataset for 100 epochs using the standard training procedure and RMSprop. Four variants of the network are trained: with ReLU [^3], SiLU [^4], Tanh and Sigmoid [^5] activation functions.
- Save the pre-trained network.
- Evaluate performance of the base pre-trained network on the test set of CIFAR-10.
- Load the base pre-trained network and replace all activation functions with the corresponding adaptive alternatives (ReLU, SiLU -> AHAF [^6]; Sigmoid, Tanh -> F-Neuron Activation [^7]).
- Evaluate performance of the derived network on the test set of CIFAR-10.
- Fine-tune the adaptive activation functions on the CIFAR-10 dataset.
- Evaluate the network performance after the activation function fine-tuning.
- Compare the evaluation results collected on steps 3, 5 and 7.
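The replacement and fine-tuning steps above can be sketched in PyTorch. This is a minimal illustration, not the repository's actual code: `AdaptiveActivation` below uses f(x) = γ·x·σ(β·x) as a stand-in for AHAF (the paper's definitions may differ), and the helper names `replace_activations` and `freeze_all_but_activations` are invented for this sketch.

```python
import torch
import torch.nn as nn


class AdaptiveActivation(nn.Module):
    """Stand-in adaptive activation: f(x) = gamma * x * sigmoid(beta * x).

    With beta = gamma = 1 this coincides with SiLU, so the replacement
    initially preserves the pre-trained model's behavior.
    """
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(1))
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.gamma * x * torch.sigmoid(self.beta * x)


def replace_activations(model: nn.Module, old=(nn.ReLU, nn.SiLU)) -> nn.Module:
    """Recursively swap matching activation modules for adaptive ones."""
    for name, child in model.named_children():
        if isinstance(child, old):
            setattr(model, name, AdaptiveActivation())
        else:
            replace_activations(child, old)
    return model


def freeze_all_but_activations(model: nn.Module) -> None:
    """Fine-tune only the activation parameters; freeze everything else."""
    for module in model.modules():
        trainable = isinstance(module, AdaptiveActivation)
        for p in module.parameters(recurse=False):
            p.requires_grad = trainable
```

After `freeze_all_but_activations`, an optimizer built from `filter(lambda p: p.requires_grad, model.parameters())` updates only the β and γ parameters, matching the "all other elements frozen" regime described above.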
Running experiments
- An NVIDIA GPU with at least 2 GiB of VRAM is recommended.
- Install the requirements from `requirements.txt`.
- Set `CUBLAS_WORKSPACE_CONFIG=:4096:8` in the environment variables.
- Use the root of this repository as the current directory.
- Add the current directory to `PYTHONPATH` so Python can find the modules.
Example:
```shell
user@host:~/repo_path$ export CUBLAS_WORKSPACE_CONFIG=:4096:8
user@host:~/repo_path$ export PYTHONPATH=".:$PYTHONPATH"
user@host:~/repo_path$ python3 experiments/train_new_base.py
```
Or as a single line, keeping the variable assignments local to the command:
```shell
user@host:~/repo_path$ CUBLAS_WORKSPACE_CONFIG=:4096:8 PYTHONPATH=".:$PYTHONPATH" python3 experiments/train_new_base.py
```
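The `CUBLAS_WORKSPACE_CONFIG` variable is what cuBLAS requires for bitwise-reproducible GPU kernels; it matters when PyTorch's deterministic mode is enabled, which the training scripts presumably do. A minimal sketch of such a setup (not taken from the repository):

```python
import os
import torch

# cuBLAS reads this variable before the first CUDA kernel launch, so it
# must be set in the environment (or very early in the process).
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

# Fix the RNG seed and force deterministic kernel implementations;
# on GPU the latter raises an error unless the variable above is set.
torch.manual_seed(42)
torch.use_deterministic_algorithms(True)
```
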
References
[^1]: Chollet, F., et al. (2015) Train a simple deep CNN on the CIFAR10 small images dataset. https://github.com/keras-team/keras/blob/1.2.2/examples/cifar10_cnn.py
[^2]: Krizhevsky, A. (2009) Learning Multiple Layers of Features from Tiny Images. Technical Report TR-2009, University of Toronto, Toronto.
[^3]: Agarap, A. F. (2018). Deep Learning using Rectified Linear Units (ReLU). https://doi.org/10.48550/ARXIV.1803.08375
[^4]: Elfwing, S., Uchibe, E., & Doya, K. (2017). Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. CoRR, abs/1702.03118. Retrieved from http://arxiv.org/abs/1702.03118
[^5]: Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signal Systems 2, 303–314 (1989). https://doi.org/10.1007/BF02551274
[^6]: Bodyanskiy, Y., & Kostiuk, S. (2022). Adaptive hybrid activation function for deep neural networks. In System research and information technologies (Issue 1, pp. 87–96). Kyiv Politechnic Institute. https://doi.org/10.20535/srit.2308-8893.2022.1.07
[^7]: Bodyanskiy, Y., & Kostiuk, S. (2022). Deep neural network based on F-neurons and its learning. Research Square Platform LLC. https://doi.org/10.21203/rs.3.rs-2032768/v1
Owner
- Name: Serhii Kostiuk
- Login: s-kostyuk
- Kind: user
- Location: Ukraine
- Company: Student of nure.ua
- Website: https://www.linkedin.com/in/skostyuk
- Repositories: 1
- Profile: https://github.com/s-kostyuk
Student, embedded developer. Works, opinions, comments, and other content are my own.
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
title: Using Adaptive Activation Functions in Pre-Trained Artificial Neural Network Models - Reference Implementation
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - family-names: Kostiuk
    given-names: Serhii
    orcid: 'https://orcid.org/0000-0003-4196-2524'
repository-code: 'https://github.com/s-kostyuk/af_replacement'
license: MIT
commit: 18cd23958fbb07f5c9addc9cf7409d2c0ffaab2d
doi: "10.5281/zenodo.8296386"
date-released: '2023-09-29'
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Bodyanskiy"
      given-names: "Yevgeniy"
      orcid: "https://orcid.org/0000-0000-0000-0000"
    - family-names: "Kostiuk"
      given-names: "Serhii"
      orcid: "https://orcid.org/0000-0003-4196-2524"
  title: "Using Adaptive Activation Functions in Pre-Trained Artificial Neural Network Models"
  collection-title: "Proceedings of the 11th International Scientific and Practical Conference “Information Control Systems & Technologies” (ICST2023)"
  year: 2023
  volume: 3513
  conference:
    name: "CEUR Workshop Proceedings"
    url: "https://ceur-ws.org/Vol-3513/paper08.pdf"
  start: 91 # First page number
  end: 105 # Last page number
  publisher:
    name: "CEUR-WS.org"
```