QuasiNewton.jl
A collection of Newton-type optimization algorithms implemented in Julia.
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to: springer.com
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.8%) to scientific vocabulary
Keywords
Repository
A collection of Newton-type optimization algorithms implemented in Julia.
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
QuasiNewton.jl
A collection of quasi-Newton optimization algorithms.
Author: Cooper Simpson
DISCLAIMER: This package is still very much a work in progress and certainly not everything has been tested thoroughly. For example, the AD functionality has only been tested (in a limited capacity) with the Enzyme.jl backend.
License & Citation
All source code is made available under an MIT license. You can freely use and modify the code, without warranty, so long as you provide attribution to the authors. See LICENSE for the full text.
This repository can be cited using the GitHub action in the sidebar, or using the metadata in CITATION.cff. See Publications for a full list of publications related to R-SFN and influencing this package. If any of these are useful to your own work, please cite them individually.
Contributing
Feel free to open issues, ask questions, or otherwise contribute!
Methods
Our use of the term quasi should not be taken in the traditional sense, i.e. as referring only to methods such as BFGS, but rather in its literal meaning, describing algorithms that resemble Newton's method. As such, our algorithms are intended for second-order unconstrained convex or non-convex optimization, i.e. we consider problems of the following form
```math
\min_{\mathbf{x}\in \mathbb{R}^n}f(\mathbf{x})
```
where $f:\mathbb{R}^n\to\mathbb{R}$ is a twice continuously differentiable function, possibly with a Lipschitz continuous Hessian. We then consider updates of the following form:
```math
\mathbf{x}_{(k+1)} = \mathbf{x}_{(k)} - \eta_{(k)}\mathbf{B}_{(k)}^{-1}\nabla f(\mathbf{x}_{(k)})
```
where $\mathbf{B}_{(k)}$ is some matrix constructed as a function of the Hessian $\mathbf{H}_{(k)} = \nabla^2 f(\mathbf{x}_{(k)})$, recovering different algorithms.
All algorithms are implemented matrix-free using Krylov subspace methods to compute the update and can leverage mixed-mode automatic differentiation to compute cheap Hessian-vector products.
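To illustrate the matrix-free idea on a toy problem (a sketch with our own variable names, not the package's internal code), the snippet below solves a single Newton system $\mathbf{H}\mathbf{d} = -\nabla f(\mathbf{x})$ for the least-squares objective $f(\mathbf{x}) = 0.5\|\mathbf{A}\mathbf{x}-\mathbf{b}\|^2$ without ever materializing $\mathbf{H} = \mathbf{A}^T\mathbf{A}$: the Hessian is a lazy composition of LinearOperators.jl operators and the system is solved with `cg` from Krylov.jl.

```julia
using LinearAlgebra
using LinearOperators #lazy, matrix-free operators
using Krylov          #Krylov subspace solvers (cg)

n = 50
A = randn(100, n)
b = randn(100)
x = randn(n)

g = A' * (A * x - b)     #gradient of f(x) = 0.5*norm(A*x - b)^2

opA = LinearOperator(A)  #wrap A without copying it
H = opA' * opA           #lazy composition: H*v applies A then A', never forms A'A

d, cg_stats = cg(H, -g)  #solve H*d = -g with conjugate gradients
x_new = x + d            #full Newton step (η = 1)

println(norm(A' * (A * x_new - b)))  #gradient norm after the step (≈ 0 up to CG tolerance)
```

In the package itself, the Hessian-vector product can instead be supplied by an AD backend, as shown in the Usage examples below.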
(Regularized) Newton
```math
\mathbf{B}_{(k)} = \mathbf{H}_{(k)} + \lambda_{(k)}\mathbf{I}
```
where choosing $\lambda_{(k)}=0$ recovers vanilla Newton's method and $\lambda_{(k)}\propto\|\nabla f(\mathbf{x}_{(k)})\|^{1/2}$ recovers the regularized Newton's method.
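For a concrete (dense, non-matrix-free) illustration of this update rule, the sketch below runs regularized Newton on the Rosenbrock function with $\lambda_{(k)} = c\|\nabla f(\mathbf{x}_{(k)})\|^{1/2}$. The helper `reg_newton`, the constant `c`, and the use of ForwardDiff.jl for derivatives are all hypothetical choices made for this example; they are not the package's implementation.

```julia
using LinearAlgebra
using ForwardDiff #dense derivatives, just for this illustration

rosen(x) = sum(100*(x[i+1]-x[i]^2)^2 + (1-x[i])^2 for i in 1:length(x)-1)

function reg_newton(f, x0; c=1.0, itmax=100, tol=1e-8)
    x = copy(x0)
    for _ in 1:itmax
        g = ForwardDiff.gradient(f, x)
        norm(g) < tol && break
        H = ForwardDiff.hessian(f, x)
        λ = c * sqrt(norm(g))   #λ_k ∝ ‖∇f(x_k)‖^(1/2); c is a hypothetical constant
        x -= (H + λ*I) \ g      #regularized Newton step with full step size η = 1
    end
    return x
end

xmin = reg_newton(rosen, zeros(4))
println(rosen(xmin)) #objective after the run; should be near 0 at the minimizer (1,...,1)
```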
Regularized Saddle-Free Newton (R-SFN)
```math
\mathbf{B}_{(k)} = \left(\mathbf{H}_{(k)}^2 + \lambda_{(k)}\mathbf{I}\right)^{1/2}
```
where the regularization term is $\lambda_{(k)}\propto\|\nabla f(\mathbf{x}_{(k)})\|$. The matrix function in question is a smooth approximation to the absolute value and is computed via the Lanczos process. See Publications for more details.
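To make the Lanczos route concrete, here is a generic sketch (our own illustration, not the package's actual routine) of approximating the step $\mathbf{d} = (\mathbf{H}^2 + \lambda\mathbf{I})^{-1/2}\nabla f$: after $m$ Lanczos iterations the Krylov relation gives $f(\mathbf{H})\mathbf{g} \approx \|\mathbf{g}\|\,\mathbf{V} f(\mathbf{T})\mathbf{e}_1$, where $\mathbf{T}$ is a small tridiagonal matrix and $f(z) = (z^2 + \lambda)^{-1/2}$ is applied through an eigendecomposition of $\mathbf{T}$. The helpers `lanczos` and `rsfn_step` are hypothetical names.

```julia
using LinearAlgebra

#m-step Lanczos on a symmetric H with start vector g: returns V (n×m) and the
#tridiagonal T (m×m) such that H*V ≈ V*T (ignoring the trailing residual term)
function lanczos(H, g, m)
    n = length(g)
    V = zeros(n, m)
    α = zeros(m)
    β = zeros(m-1)
    V[:, 1] = g / norm(g)
    for j in 1:m
        w = H * V[:, j]
        α[j] = dot(w, V[:, j])
        w -= α[j] * V[:, j]
        j > 1 && (w -= β[j-1] * V[:, j-1])
        if j < m
            β[j] = norm(w)     #no breakdown handling in this sketch
            V[:, j+1] = w / β[j]
        end
    end
    return V, SymTridiagonal(α, β)
end

#approximate d = (H^2 + λI)^(-1/2)*g via d ≈ ‖g‖ V f(T) e1 with f(z) = (z^2 + λ)^(-1/2)
function rsfn_step(H, g, λ; m=20)
    V, T = lanczos(H, g, m)
    E = eigen(T)               #small m×m eigendecomposition
    fT1 = E.vectors * ((E.values.^2 .+ λ).^(-1/2) .* E.vectors[1, :]) #f(T)*e1
    return norm(g) * (V * fT1)
end

#example use on a random symmetric (generally indefinite) matrix, with λ ∝ ‖g‖
n = 100
H = Symmetric(randn(n, n))
g = randn(n)
d = rsfn_step(H, g, norm(g))
```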
Adaptive Regularization with Cubics (ARC)
```math
\mathbf{B}_{(k)} = \mathbf{H}_{(k)} + \lambda_{(k)}\mathbf{I}
```
where the analytic choice of regularization is $\lambda_{(k)}\propto\|\mathbf{x}_{(k+1)} - \mathbf{x}_{(k)}\|$, but in practice this is chosen adaptively. This particular implementation is based on ARCqK.
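Schematically, the adaptive choice works like a trust-region test: a trial step $\mathbf{s}$ is accepted or rejected based on the ratio $\rho = (f(\mathbf{x}_{(k)}) - f(\mathbf{x}_{(k)}+\mathbf{s})) / (m_k(\mathbf{0}) - m_k(\mathbf{s}))$ of actual to model-predicted decrease, and the regularization weight is loosened or tightened accordingly. The sketch below shows generic update logic of this kind; the constants and the function name are illustrative, not the ARCqK rules.

```julia
#generic accept/adjust logic used by cubic-regularization methods (a sketch,
#not the ARCqK implementation); σ plays the role of the adaptive weight behind λ
function update_regularization(ρ, σ; η1=0.1, η2=0.9, γdown=0.5, γup=2.0)
    accepted = ρ ≥ η1      #enough actual decrease relative to the model?
    if ρ ≥ η2
        σ = γdown * σ      #very successful step: regularize less
    elseif !accepted
        σ = γup * σ        #rejected step: regularize more and retry
    end
    return accepted, σ
end
```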
Installation
This package can be installed just like any other Julia package. From the terminal, after starting the Julia REPL, run the following:
```julia
using Pkg
Pkg.add("QuasiNewton")
```
which will install the package and its direct dependencies. For automatic differentiation (AD), we use DifferentiationInterface.jl, which is agnostic to the backend. Backends such as Enzyme.jl or Zygote.jl can be installed by the user and specified in the optimization call. See JuliaDiff for current details on the Julia AD ecosystem.
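For example, to use the Enzyme.jl backend (mirroring the Usage examples below), something like the following should work; Zygote.jl would be selected with `AutoZygote()` instead.

```julia
using Pkg
Pkg.add("Enzyme")       #or "Zygote", etc.

using QuasiNewton, DifferentiationInterface
import Enzyme

backend = AutoEnzyme()  #backend object passed to the optimization call
```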
Testing
To test the package, run the following command in the REPL:
```julia
using Pkg
Pkg.test(test_args=[<"optional specific tests">], coverage=true)
```
Usage
Loading the package as usual
```julia
using QuasiNewton
```
will export four in-place optimization routines: optimize!, newton!, rsfn!, and arc!. We only cover the first in detail, as it subsumes the others. The optimization algorithm is selected via a symbol, and depending on the function information provided, the call is dispatched to an AD-based or LinearOperators.jl-based execution. All methods return a Stats object containing relevant information about the procedure, with some information only being stored if history=true.
First, we consider the following example of minimizing a simple least-squares problem with Newton's method.
```julia
using LinearAlgebra #for norm
using LinearOperators #for Hessian

d = 100 #problem dimension

A = randn(d,d) #random data
b = A*randn(d) #random rhs

f(x) = 0.5*norm(A*x-b)^2 #objective function

function fg!(g, x) #computes gradient in-place and returns objective function
    g .= A'*(A*x - b)
    return f(x)
end

H(x) = LinearOperator(A'*A) #computes Hessian

stats = optimize!(randn(d), f, fg!, H, Val(:newton); itmax=100, time_limit=60, posdef=true, M=0.0)

show(stats)
#=
Converged: true
Iterations: 1
Function Evals: 3
Hvp Evals: 151
Run Time (s): 3.59e-02
Minimum: 1.600e-12
Gradient Norm: 2.259e-05
Max/Avg. Residual Norm: 0.000e+00, 0.000e+00
Max/Avg. Regularization: 0.000e+00, 0.000e+00
Avg. Krylov Iterations: 1.510e+02
Status:
=#
```
Second, we consider the following example of minimizing the non-convex Rosenbrock function with R-SFN.
```julia
using DifferentiationInterface #for AD
import Enzyme #for AD

function rosenbrock(x) #objective function
    res = 0.0
    for i = 1:size(x,1)-1
        res += 100*(x[i+1]-x[i]^2)^2 + (1-x[i])^2
    end
    return res
end

stats = optimize!(zeros(10), rosenbrock, Val(:rsfn), AutoEnzyme(); itmax=100, time_limit=60, M=1e-8)

show(stats)
#=
Converged: true
Iterations: 73
Function Evals: 74
Hvp Evals: 353
Run Time (s): 3.30e-04
Minimum: 6.415e-12
Gradient Norm: 1.458e-05
Max/Avg. Residual Norm: 7.101e+00, 5.267e-01
Max/Avg. Regularization: 1.288e+02, 1.264e+01
Avg. Krylov Iterations: 4.836e+00
Status:
=#
```
Publications
More information about R-SFN can be found in the following publications. Please cite them if they are useful for your own work!
Regularized Saddle-Free Newton: Saddle Avoidance and Efficient Implementation
```bibtex
@mastersthesis{rsfn,
  title = {{Regularized Saddle-Free Newton: Saddle Avoidance and Efficient Implementation}},
  author = {Cooper Simpson},
  school = {Dept. of Applied Mathematics, CU Boulder},
  year = {2022},
  type = {{M.S.} Thesis}
}
```
Owner
- Name: Cooper Simpson
- Login: RS-Coop
- Kind: user
- Website: rs-coop.github.io
- Repositories: 3
- Profile: https://github.com/RS-Coop
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Simpson"
    given-names: "Cooper"
    orcid: "https://orcid.org/0000-0003-3922-2797"
    affiliation: "Department of Applied Mathematics, University of Washington Seattle, Washington, USA"
title: "QuasiNewton.jl"
version: 0.0.0
license: MIT
date-released: 2023
url: "https://github.com/rs-coop/QuasiNewton.jl"
note: "A collection of Newton-type optimization algorithms implemented in Julia."
```
GitHub Events
Total
- Delete event: 2
- Push event: 62
- Create event: 3
Last Year
- Delete event: 2
- Push event: 62
- Create event: 3