https://github.com/beniaminogreen/zoomergp
Science Score: 46.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ✓ Committers with academic emails: 1 of 1 committers (100.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.3%) to scientific vocabulary
Last synced: 6 months ago
Repository
Basic Info
- Host: GitHub
- Owner: beniaminogreen
- License: gpl-3.0
- Language: Rust
- Default Branch: main
- Homepage: http://beniamino.org/zoomerGP/
- Size: 2.06 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Created 9 months ago · Last pushed 6 months ago
Metadata Files
Readme
License
README.Rmd
---
output: github_document
always_allow_html: true
---
```{r, include=F, echo=F}
devtools::load_all()
library(tidyverse)
```
# ZoomerGP: Gaussian Process Regression in R
`zoomerGP` provides a fast and composable language to write, fit, and explore Gaussian process models in R. The package allows users to compose a Gaussian process kernel using a concise formula syntax, and to tune hyperparameters through type-II maximum likelihood or variational inference (SVGD). Both Gaussian and non-Gaussian likelihoods are supported.
## Kernel formulas:
`zoomerGP`'s flagship feature is a tidy formula interface that allows users to cleanly specify complex Gaussian process models. For example, you can specify and fit a Gaussian process with an RBF covariance function using the following formula:
```{r, eval=F}
gaussian_process(y ~ rbf(x), data = data)
```
This syntax also allows you to combine kernels arbitrarily by adding and multiplying them. For example, you can fit a periodic trend that changes over time, plus a long-term trend, using the following code:
```{r, eval=F}
gaussian_process(y ~ periodic(x) * rbf(x) + rbf(x), data = data)
```
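To make the arithmetic behind these formulas concrete, the following base-R sketch builds by hand the covariance matrix implied by `periodic(x) * rbf(x) + rbf(x)`. The helper names (`rbf_kernel`, `periodic_kernel`) and their hyperparameter arguments are illustrative assumptions, not functions exported by `zoomerGP`, and the package's internal parameterization may differ.

```r
# Squared-exponential (RBF) kernel on a 1-D input vector (hypothetical helper).
rbf_kernel <- function(x, lengthscale = 1, variance = 1) {
  d <- outer(x, x, "-")
  variance * exp(-d^2 / (2 * lengthscale^2))
}

# Standard periodic kernel on a 1-D input vector (hypothetical helper).
periodic_kernel <- function(x, period = 1, lengthscale = 1, variance = 1) {
  d <- outer(x, x, "-")
  variance * exp(-2 * sin(pi * abs(d) / period)^2 / lengthscale^2)
}

x <- seq(0, 5, length.out = 25)

# Products and sums of kernels are taken element-wise on the matrices:
# a locally periodic component plus a smooth long-term trend.
K <- periodic_kernel(x, period = 1) * rbf_kernel(x, lengthscale = 3) +
  rbf_kernel(x, lengthscale = 5)

# A valid covariance matrix is symmetric with non-negative eigenvalues
# (up to floating-point error).
isSymmetric(K)
min(eigen(K, symmetric = TRUE, only.values = TRUE)$values) > -1e-8
```

Because sums and products of valid kernels are themselves valid kernels, any formula built this way still describes a proper Gaussian process prior.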
Each kernel can also act on multiple variables, as shown below:
```{r, eval =F}
gaussian_process(a ~ periodic(b,c) * rbf(d,e,f,g) + rbf(h), data = data)
```
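As a rough illustration of what a kernel over several variables computes, the sketch below evaluates an RBF kernel on the Euclidean distance between rows of a multi-column input. The helper `rbf_multi` is hypothetical, and the single shared lengthscale is an assumption; `zoomerGP` may use a different parameterization (for example, one lengthscale per variable).

```r
# RBF kernel over several input columns, using Euclidean distance between rows.
# (Hypothetical helper; one shared lengthscale is an assumption.)
rbf_multi <- function(X, lengthscale = 1, variance = 1) {
  D2 <- as.matrix(dist(X))^2
  variance * exp(-D2 / (2 * lengthscale^2))
}

set.seed(1)
data <- data.frame(d = rnorm(10), e = rnorm(10), h = rnorm(10))

# Analogue of rbf(d, e) + rbf(h): one kernel sees two columns, the other sees one.
K <- rbf_multi(as.matrix(data[, c("d", "e")])) +
  rbf_multi(as.matrix(data[, "h", drop = FALSE]))
dim(K)
```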
## Available Kernels:
At present the following kernels are implemented:
| Kernel | Function Name | Notes |
| :--------------------------|:-------------------:|:-----:|
| Squared Exponential | rbf() | Results in a smooth, infinitely-differentiable posterior. Too smooth for most applications |
| Spectral Mixture | spectral[0-5]() | Spectral mixture kernel with 1, 2, 3, 4, or 5 components. Can recover any stationary covariance function but is very expensive to fit. |
| Linear | linear() | Recovers a linear trend |
| Indicator | indicator() | Equal to $\sigma^2$ if $x=x'$, zero otherwise. Useful when accounting for clustering. |
| If / mask | mask() | Equal to one if both $x$ and $x'$ are equal to one, zero otherwise. Most useful when 'masking' out another kernel so it only acts on certain pairs of inputs |
| Periodic | periodic() | Periodic similarity function. Can accept as a keyword argument a vector of periods associated with each dimension. |
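
The indicator and mask rows of the table can be sketched in a few lines of base R. The helpers below (`indicator_kernel`, `mask_kernel`, `rbf_kernel`) and the toy variables are illustrative assumptions that follow the definitions in the table; they are not the package's own implementations.

```r
# Indicator kernel: sigma^2 when two inputs are identical, zero otherwise.
indicator_kernel <- function(g, variance = 1) {
  variance * outer(g, g, "==")
}

# Mask kernel: one when both inputs equal one, zero otherwise.
mask_kernel <- function(m) {
  outer(m == 1, m == 1, "&") * 1
}

# Plain RBF kernel on a 1-D input, used here as the kernel being masked.
rbf_kernel <- function(x, lengthscale = 1) {
  exp(-outer(x, x, "-")^2 / (2 * lengthscale^2))
}

x       <- c(0.0, 0.5, 1.0, 1.5)
cluster <- c("a", "a", "b", "b")
treated <- c(1, 0, 1, 1)

# Shared variance within clusters, plus a smooth effect that only links
# pairs of rows where both are treated.
K <- indicator_kernel(cluster, variance = 0.5) + mask_kernel(treated) * rbf_kernel(x)
K
```

Multiplying by the mask zeroes out every entry of the RBF matrix that involves the untreated row, so that row is correlated with others only through its cluster.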
## Fitting to Large Datasets with Sparse Approximations
Gaussian process regression is computationally intensive, and generally takes
$O(n^3)$ computations to fit. To allow the method to scale to large datasets,
this package implements the projected process approximation described in
Rasmussen and Williams's (2005) *Gaussian Processes for Machine Learning*. To
turn this approximation on, use the `sparse=T` argument when fitting the
Gaussian process. Please be aware that you may also need to change
the default number of inducing points (`n_points`) for numerical
stability.
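The idea behind the approximation can be sketched in base R. The code below compares the exact posterior mean with a projected-process posterior mean that uses $m$ inducing points, so that only an $m \times m$ system is solved. It is a minimal sketch under stated assumptions (a fixed RBF kernel, a known noise variance, evenly spaced inducing points, and hypothetical helper names), not the package's Rust implementation.

```r
# RBF cross-covariance between two 1-D input vectors (hypothetical helper).
rbf_cross <- function(a, b, lengthscale = 1) {
  exp(-outer(a, b, "-")^2 / (2 * lengthscale^2))
}

# Exact GP posterior mean: O(n^3) because of the n x n solve.
full_gp_mean <- function(x, y, x_new, noise_var = 0.1) {
  K <- rbf_cross(x, x)
  drop(rbf_cross(x_new, x) %*% solve(K + noise_var * diag(length(x)), y))
}

# Projected-process posterior mean with m inducing points z: only an m x m
# solve is needed, so the cost is O(n m^2).
pp_mean <- function(x, y, z, x_new, noise_var = 0.1) {
  K_mm <- rbf_cross(z, z)
  K_mn <- rbf_cross(z, x)
  A <- noise_var * K_mm + K_mn %*% t(K_mn)
  drop(rbf_cross(x_new, z) %*% solve(A, K_mn %*% y))
}

set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.3)
z <- seq(0, 10, length.out = 10)  # 10 inducing points instead of 200 data points
x_new <- c(2.5, 5, 7.5)

full_gp_mean(x, y, x_new)
pp_mean(x, y, z, x_new)
```

When the inducing points coincide with the training inputs, the two functions return the same predictions. Closely spaced inducing points make `K_mm` nearly singular, which is one reason the number of inducing points can affect numerical stability.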
## Optimization Methods
Currently, the package supports three engines for hyperparameter optimization.
`bfgs` is selected by default, but this can be changed using the
`training_method` argument to the `gaussian_process` function.
* `bfgs` - as implemented by the `optim` function in R. Will sometimes fail to
converge, leading to an error.
* `rgenoud` - as implemented in the `rgenoud` package. This is likely the most
robust optimizer as it avoids getting stuck in local minima, but also the
most computationally intensive.
* `coin` - a learning-rate-free algorithm that is implemented in Rust. Seems to
work well across a variety of problems, but can get stuck in local minima.
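To show what these optimizers are minimizing, here is a minimal base-R sketch of type-II maximum likelihood: the negative log marginal likelihood of a zero-mean GP with an RBF kernel, minimized with `optim(method = "BFGS")`. The function `neg_log_marginal`, the log-scale parameterization, and the jitter term are assumptions made for illustration; the objective used inside `zoomerGP` may be set up differently.

```r
# Negative log marginal likelihood of a zero-mean GP with an RBF kernel.
# Parameters are optimized on the log scale so they stay positive.
neg_log_marginal <- function(log_par, x, y) {
  par <- exp(log_par)  # lengthscale, signal variance, noise variance
  d <- outer(x, x, "-")
  K <- par[2] * exp(-d^2 / (2 * par[1]^2)) + (par[3] + 1e-6) * diag(length(x))
  U <- tryCatch(chol(K), error = function(e) NULL)  # upper triangular, t(U) %*% U = K
  if (is.null(U)) return(1e10)  # penalize numerically invalid settings
  alpha <- backsolve(U, forwardsolve(t(U), y))  # solves K alpha = y
  0.5 * sum(y * alpha) + sum(log(diag(U))) + 0.5 * length(x) * log(2 * pi)
}

set.seed(1)
x <- seq(0, 10, length.out = 30)
y <- sin(x) + rnorm(30, sd = 0.1)

init <- log(c(1, 1, 0.1))
fit <- optim(init, neg_log_marginal, x = x, y = y, method = "BFGS")

exp(fit$par)     # fitted lengthscale, signal variance, noise variance
fit$convergence  # 0 indicates that optim reported convergence
```

The objective is generally non-convex in the hyperparameters, so a local method like BFGS can settle in a local minimum depending on the starting values.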
## Fitting Non-Gaussian Likelihoods with Variational Inference
`zoomerGP` also implements a variant of the [Stein Variational Gaussian Processes](https://arxiv.org/abs/2009.12141) algorithm to perform variational inference over hyperparameters and latent function values when the likelihood is non-Gaussian. Specifically, we use the [Coin Sampling](https://arxiv.org/abs/2301.11294) algorithm proposed by the same authors to provide a robust Variational Inference engine that does not require tuning by the user.
At the moment, inference for the following non-Gaussian likelihoods is implemented:
- logistic regression (via the `logistic_gaussian_process` function).
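For readers unfamiliar with Stein variational methods, the sketch below shows a plain SVGD update with a fixed step size, applied to a toy one-dimensional target. It is only meant to illustrate the particle update that this family of algorithms builds on. It is not the Coin Sampling variant used by the package (which removes the step size), and the function `svgd_step`, the fixed bandwidth `h`, and the Gaussian target are all assumptions chosen for brevity.

```r
# One SVGD update for 1-D particles with an RBF kernel of bandwidth h.
# (Hypothetical helper; plain SVGD with a fixed step size.)
svgd_step <- function(particles, grad_log_p, step = 0.1, h = 1) {
  diff <- outer(particles, particles, "-")  # diff[j, i] = x_j - x_i
  k <- exp(-diff^2 / (2 * h))
  # Kernel-weighted average of the score: pulls particles toward high density.
  attract <- colMeans(k * grad_log_p(particles))
  # Average gradient of the kernel: keeps particles spread out.
  repel <- colMeans(-diff / h * k)
  particles + step * (attract + repel)
}

# Toy target: a normal distribution with mean 2 and unit variance.
grad_log_p <- function(x) -(x - 2)

particles <- seq(-1, 1, length.out = 20)
for (iter in 1:1000) {
  particles <- svgd_step(particles, grad_log_p)
}

mean(particles)  # should end up close to the target mean
```

In a GP setting the particles would be joint draws of hyperparameters and latent function values rather than scalars, and the score would come from the non-Gaussian likelihood and the GP prior.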
# Installation Instructions
## Installing Rust
If a pre-built version of the package is not available for your operating
system or version of R, you must have the
[Rust compiler](https://www.rust-lang.org/tools/install) installed to compile
this package from source. After the package is compiled, Rust is no longer
required and can be safely uninstalled.
#### Installing Rust on Linux or Mac:
To install Rust on Linux or Mac, you can simply run the following snippet in
your terminal.
``` sh
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
#### Installing Rust on Windows:
To install Rust on Windows, you can use the Rust installation wizard,
`rustup-init.exe`, found [at this
site](https://forge.rust-lang.org/infra/other-installation-methods.html).
Depending on your version of Windows, you may see an error that looks something like this:
```
error: toolchain 'stable-x86_64-pc-windows-gnu' is not installed
```
In this case, you should run `rustup install stable-x86_64-pc-windows-gnu` to
install the missing toolchain. If you're missing another toolchain, substitute
its name for `stable-x86_64-pc-windows-gnu` in the command above.
### Installing Package from GitHub:
Once you have Rust installed, you should be able to install the package
using either the `install_github` function from the `devtools` package or the
`pkg_install` function from the `pak` package.
``` r
## Install with devtools
# install.packages("devtools")
devtools::install_github("beniaminogreen/zoomergp")
## Install with pak
# install.packages("pak")
pak::pkg_install("beniaminogreen/zoomergp")
```
### Loading The Package
Once the package is installed, you can load it into memory as usual by typing:
```{r, warning = FALSE, message = FALSE, eval = F}
library(zoomerGP)
```
Owner
- Name: Beniamino Green
- Login: beniaminogreen
- Kind: user
- Location: New Haven, CT
- Company: Yale University
- Repositories: 7
- Profile: https://github.com/beniaminogreen
Pre-doctoral Fellow
GitHub Events
Total
- Push event: 5
Last Year
- Push event: 5
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Beniamino Green | b****n@y****u | 31 |
Committer Domains (Top 20 + Academic)
yale.edu: 1
Issues and Pull Requests
Last synced: 7 months ago
Dependencies
.github/workflows/pkgdown.yaml
actions
- JamesIves/github-pages-deploy-action v4.5.0 composite
- actions/checkout v4 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
src/rust/Cargo.lock
cargo
- approx 0.5.1
- autocfg 1.4.0
- build-print 0.1.1
- bumpalo 3.17.0
- cauchy 0.4.0
- cblas-sys 0.1.4
- cfg-if 1.0.0
- console 0.15.11
- crossbeam-deque 0.8.6
- crossbeam-epoch 0.9.18
- crossbeam-utils 0.8.21
- either 1.15.0
- encode_unicode 1.0.0
- extendr-api 0.8.0
- extendr-ffi 0.8.0
- extendr-macros 0.8.0
- getrandom 0.2.16
- indicatif 0.17.11
- js-sys 0.3.77
- katexit 0.1.5
- lapack-sys 0.14.0
- lax 0.17.0
- libc 0.2.172
- log 0.4.27
- matrixmultiply 0.3.9
- ndarray 0.16.1
- ndarray-linalg 0.17.0
- num-complex 0.4.6
- num-integer 0.1.46
- num-traits 0.2.19
- number_prefix 0.4.0
- once_cell 1.21.3
- paste 1.0.15
- portable-atomic 1.11.0
- portable-atomic-util 0.2.4
- ppv-lite86 0.2.21
- proc-macro2 1.0.95
- quote 1.0.40
- rand 0.8.5
- rand_chacha 0.3.1
- rand_core 0.6.4
- rawpointer 0.2.1
- rayon 1.10.0
- rayon-core 1.12.1
- serde 1.0.219
- serde_derive 1.0.219
- syn 2.0.101
- thiserror 2.0.12
- thiserror-impl 2.0.12
- unicode-ident 1.0.18
- unicode-width 0.2.0
- wasi 0.11.0+wasi-snapshot-preview1
- wasm-bindgen 0.2.100
- wasm-bindgen-backend 0.2.100
- wasm-bindgen-macro 0.2.100
- wasm-bindgen-macro-support 0.2.100
- wasm-bindgen-shared 0.2.100
- web-time 1.1.0
- windows-sys 0.59.0
- windows-targets 0.52.6
- windows_aarch64_gnullvm 0.52.6
- windows_aarch64_msvc 0.52.6
- windows_i686_gnu 0.52.6
- windows_i686_gnullvm 0.52.6
- windows_i686_msvc 0.52.6
- windows_x86_64_gnu 0.52.6
- windows_x86_64_gnullvm 0.52.6
- windows_x86_64_msvc 0.52.6
- zerocopy 0.8.25
- zerocopy-derive 0.8.25
src/rust/Cargo.toml
cargo
DESCRIPTION
cran
- R >= 4.2 depends
- memoise * imports
- rgenoud * imports
- rlang * imports
- tidyselect * imports
- ggplot2 * suggests
- knitr * suggests
- nlopt * suggests
- rmarkdown * suggests
- testthat >= 3.0.0 suggests
- tibble * suggests