PartitionedKnetNLPModels.jl
Science Score: 57.0%
This score indicates how likely this project is to be science-related, based on various indicators:
- ✓ CITATION.cff file found
- ✓ codemeta.json file found
- ✓ .zenodo.json file found
- ✓ DOI references: 1 DOI reference(s) found in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: JSO-Boneyard
- License: other
- Language: Julia
- Default Branch: master
- Size: 505 KB
Statistics
- Stars: 2
- Watchers: 3
- Forks: 2
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
PartitionedKnetNLPModels : A partitioned quasi-Newton stochastic method to train partially separable neural networks
| Documentation | Linux/macOS/Windows/FreeBSD | Coverage |
|:-----------------:|:-------------------------------:|:------------:|
| [![doi][doi-img]][doi-url] | | |
⚠️ Deprecated Package
This package is currently deprecated and no further maintenance or updates are planned. If you are interested in reviving or maintaining this package, feel free to reach out — we’d be happy to discuss or support such efforts.
Motivation
This module addresses the training of partially separable neural networks, that is, the minimization of a partially separable loss function $f: \mathbb{R}^n \to \mathbb{R}$ of the form
$$
f(x) = \sum_{i=1}^N f_i (U_i x), \quad f_i : \mathbb{R}^{n_i} \to \mathbb{R}, \quad U_i \in \mathbb{R}^{n_i \times n}, \quad n_i \ll n,
$$
where:
* $f_i$ is the $i$-th element function, whose dimension $n_i$ is smaller than that of $f$;
* $U_i$ is the linear operator selecting the linear combinations of variables that parametrize $f_i$.
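As a toy illustration (made up for this README, not part of the package API), here is a partially separable function with $N = 2$ element functions on $n = 3$ variables, where each $U_i$ is a simple selection matrix:

```julia
# U1 selects (x1, x2), U2 selects (x2, x3): each element function
# depends on only n_i = 2 of the n = 3 variables.
U1 = [1.0 0.0 0.0; 0.0 1.0 0.0]
U2 = [0.0 1.0 0.0; 0.0 0.0 1.0]

f1(y) = y[1]^2 + y[2]^2   # element function applied to U1 * x
f2(y) = (y[1] - y[2])^2   # element function applied to U2 * x

# f(x) = Σᵢ fᵢ(Uᵢ x): the partially separable structure
f(x) = f1(U1 * x) + f2(U2 * x)

x = [1.0, 2.0, 3.0]
f(x)  # 1 + 4 + (2 - 3)^2 = 6
```

In practice the $U_i$ are never stored as dense matrices; they only encode which variables each element function touches.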
PartitionedKnetNLPModels.jl defines a stochastic trust-region method exploiting the partitioned structure of the derivatives of $f$: the gradient
$$ \nabla f(x) = \sum_{i=1}^N U_i^\top \nabla \hat{f}_i (U_i x), $$
and the Hessian
$$ \nabla^2 f(x) = \sum_{i=1}^N U_i^\top \nabla^2 \hat{f}_i (U_i x) U_i, $$
are sums of the element derivatives $\nabla \hat{f}_i$ and $\nabla^2 \hat{f}_i$. This structure allows one to define a partitioned quasi-Newton approximation of $\nabla^2 f$,
$$ B = \sum_{i=1}^N U_i^\top \hat{B}_i U_i, $$
where each $\hat{B}_i \approx \nabla^2 \hat{f}_i$. Contrary to the BFGS and SR1 updates, of rank 2 and 1 respectively, the rank of the update of $B$ is proportional to $\min(N,n)$.
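The assembly of $B$ from the element approximations can be sketched with dense matrices (for illustration only; the package works with partitioned linear operators, never with an explicit $n \times n$ matrix). The $U_i$ and $\hat{B}_i$ below are arbitrary small examples:

```julia
using LinearAlgebra

U1 = [1.0 0.0 0.0; 0.0 1.0 0.0]   # selects (x1, x2)
U2 = [0.0 1.0 0.0; 0.0 0.0 1.0]   # selects (x2, x3)
B1 = [2.0 0.0; 0.0 2.0]           # B̂₁ ≈ ∇²f̂₁, a 2×2 element approximation
B2 = [2.0 -2.0; -2.0 2.0]         # B̂₂ ≈ ∇²f̂₂

# B = Σᵢ Uᵢᵀ B̂ᵢ Uᵢ: the n × n partitioned quasi-Newton approximation
B = U1' * B1 * U1 + U2' * B2 * U2

# Each quasi-Newton step updates every B̂ᵢ, so B changes by a rank
# proportional to min(N, n), not by a single rank-1 or rank-2 correction.
```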
Reference
- A. Griewank and P. Toint, Partitioned variable metric updates for large structured optimization problems, Numerische Mathematik, 39, pp. 119–137, 1982.
Content
PartitionedKnetNLPModels.jl defines:
- A new layer architecture, called a "separable layer". This layer requires the size of the previous layer `p`, the size of the next layer `nl`, and the number of classes `C`:
```julia
separable_layer = SL(p, C, nl/C)
```
- A partially separable loss function `PSLDP` (partially separable loss, deterministic prediction);
- A stochastic trust-region method which uses a partitioned quasi-Newton linear operator to build a quadratic approximation of the `PSLDP` loss.
We assume that the reader is familiar with Knet and with neural networks; otherwise, see the Knet tutorials.
First, you have to define the architecture of your neural network. Here, the `PSNet` architecture is made of convolutional layers `Conv` and separable layers `SL`:
```julia
using PartitionedKnetNLPModels

C = 10 # number of classes for MNIST
layerPS = [40, 20, 1]
PSNet = PartChainPSLDP(
  Conv(4, 4, 1, 20),
  Conv(4, 4, 20, 50),
  SL(800, C, layerPS[1]),
  SL(C * layerPS[1], C, layerPS[2]),
  SL(C * layerPS[2], C, layerPS[3]; f = identity),
)
```
The MNIST dataset:
```julia
(xtrn, ytrn) = MNIST.traindata(Float32)
ytrn[ytrn .== 0] .= 10
data_train = (xtrn, ytrn) # training dataset

(xtst, ytst) = MNIST.testdata(Float32)
ytst[ytst .== 0] .= 10
data_test = (xtst, ytst) # testing dataset
```
Then, define the associated `PartitionedKnetNLPModel`:
```julia
nlp_plbfgs = PartitionedKnetNLPModel(PSNet; name = :plbfgs, data_train, data_test)
```
`nlp_plbfgs` handles the evaluation of `PSNet` on a minibatch of `data_train`, as well as the explicit computation of the objective function and its derivatives:
```julia
n = length(nlp_plbfgs.meta.x0) # size of the minimization problem
w = rand(n) # random point
f = NLPModels.obj(nlp_plbfgs, w) # compute the loss function
g = NLPModels.grad(nlp_plbfgs, w) # compute the gradient of the loss function
```
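As a sanity check, the gradient returned by `grad` can be compared against a finite difference of `obj`. This is a generic sketch valid for any model implementing the NLPModels API; the helper name, coordinate `j`, and step `h` are arbitrary choices, not part of this package:

```julia
using NLPModels

# Compare grad against a central finite difference of obj along coordinate j.
function check_gradient(nlp, w; j = 1, h = 1e-5)
  e = zeros(length(w))
  e[j] = 1.0
  fd = (NLPModels.obj(nlp, w + h * e) - NLPModels.obj(nlp, w - h * e)) / (2h)
  g = NLPModels.grad(nlp, w)
  abs(g[j] - fd) # should be small if the gradient is consistent
end
```

Keep in mind that with a stochastic model, `obj` and `grad` must be evaluated on the same minibatch for the check to be meaningful.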
From these features, PartitionedKnetNLPModels.jl defines PUS (partitioned update solver), a stochastic trust-region method using partitioned quasi-Newton updates:
```julia
PUS(nlp_plbfgs; max_time, max_iter)
```
To use a quasi-Newton approximation other than PLBFGS, define a new `PartitionedKnetNLPModel` with another `name`, for example:
```julia
nlp_plsr1 = PartitionedKnetNLPModel(PSNet; name = :plsr1, data_train, data_test)
nlp_plse = PartitionedKnetNLPModel(PSNet; name = :plse, data_train, data_test)
```
Dependencies
The module Knet is used to define the operators required by the neural network, such as convolution and pooling, in a way that lets the neural network run on a GPU.
KnetNLPModels provides an interface between a Knet neural network and an `ADNLPModel`.
The partitioned quasi-Newton operators used in the partially separable training are defined in PartitionedKnetNLPModels.jl.
How to install
```julia
julia> ]
pkg> add https://github.com/paraynaud/KnetNLPModels.jl, https://github.com/paraynaud/PartitionedKnetNLPModels.jl
pkg> test PartitionedKnetNLPModels
```
Owner
- Name: JSO-Boneyard
- Login: JSO-Boneyard
- Kind: organization
- Repositories: 7
- Profile: https://github.com/JSO-Boneyard
Citation (CITATION.bib)
@Misc{raynaud2022,
author = {P. Raynaud },
title = {{PartitionedKnetNLPModels.jl}:},
month = {Month},
howpublished = {\url{https://github.com/paraynaud/PartitionedKnetNLPModels.jl}},
year = {2022},
DOI = {}
}