https://github.com/beniaminogreen/zoomergp

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.3%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 9 months ago · Last pushed 6 months ago
Metadata Files
Readme License

README.Rmd

---
output: github_document
always_allow_html: true
---

```{r, include=F, echo=F}
devtools::load_all()
library(tidyverse)
```

# ZoomerGP: Gaussian Process Regression in R

`zoomerGP` provides a fast and composable language to write, fit, and explore Gaussian process models in R. The package allows users to compose a Gaussian process kernel using a concise formula syntax, and to tune hyperparameters through type-II maximum likelihood or variational inference (SVGD). Both Gaussian and non-Gaussian likelihoods are supported.

## Kernel formulas:

`zoomerGP`'s flagship feature is a tidy formula interface that allows users to cleanly specify complex Gaussian process models. As an example, you can specify a Gaussian process with an RBF covariance function and fit it to a dataset with the following formula:

```{r, eval=F}
gaussian_process(y ~ rbf(x), data = data)
```

This syntax also allows you to arbitrarily combine different kernels by adding and multiplying them. As an example, you can fit a periodic trend that changes over time, plus a long-term trend, by using the following code:


```{r, eval=F}
gaussian_process(y ~ periodic(x) * rbf(x) + rbf(x), data = data)
```

Each kernel can also act on multiple variables, as shown below: 

```{r, eval =F}
gaussian_process(a ~ periodic(b,c) * rbf(d,e,f,g) + rbf(h), data = data)
```
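
To make the meaning of `+` and `*` in a kernel formula concrete, here is a minimal base-R sketch of what a composition such as `periodic(x) * rbf(x) + rbf(x)` corresponds to at the level of covariance matrices. It is illustrative only and is not taken from the package internals: the helper names `rbf_kernel` and `periodic_kernel`, their parameterization, and the hyperparameter values are all assumptions.

```r
# Illustrative helpers (hypothetical names, not part of zoomerGP).

# Squared-exponential (RBF) kernel between two numeric vectors.
rbf_kernel <- function(x1, x2, lengthscale = 1, variance = 1) {
  d <- outer(x1, x2, "-")
  variance * exp(-d^2 / (2 * lengthscale^2))
}

# Periodic kernel with a given period.
periodic_kernel <- function(x1, x2, period = 1, lengthscale = 1, variance = 1) {
  d <- outer(x1, x2, "-")
  variance * exp(-2 * sin(pi * d / period)^2 / lengthscale^2)
}

x <- seq(0, 10, length.out = 25)

# Analogue of periodic(x) * rbf(x) + rbf(x): the elementwise product gives a
# periodic pattern whose shape can drift, and the sum adds a slow trend.
K <- periodic_kernel(x, x, period = 2) * rbf_kernel(x, x, lengthscale = 3) +
  rbf_kernel(x, x, lengthscale = 8)

# Sums and elementwise products of valid kernels are valid kernels, so K is
# symmetric with non-negative eigenvalues (up to floating-point error).
min_eig <- min(eigen(K, symmetric = TRUE, only.values = TRUE)$values)
```

Because sums and products of covariance functions are again covariance functions, any formula built from these two operations yields a valid Gaussian process prior.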

## Available Kernels:

At present the following kernels are implemented: 


| Kernel                     | Function Name       | Notes |
| :--------------------------|:-------------------:|:-----:|
| Squared Exponential        | rbf()               | Results in a smooth, infinitely-differentiable posterior. Too smooth for most applications. |
| Spectral Mixture           | spectral[0-5]()     | Spectral mixture kernel with 1, 2, 3, 4, or 5 components. Can recover any stationary covariance function but is very expensive to fit. |
| Linear                     | linear()            | Recovers a linear trend. |
| Indicator                  | indicator()         | Equal to $\sigma^2$ if $x = x'$, zero otherwise. Useful when accounting for clustering. |
| If / mask                  | mask()              | Equal to one if both $x$ and $x'$ are equal to one, zero otherwise. Most useful when 'masking' out another kernel so it only acts on certain pairs of inputs. |
| Periodic                   | periodic()          | Periodic similarity function. Accepts, as a keyword argument, a vector of periods, one for each dimension. |
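
The indicator and mask kernels are easiest to understand from the matrices they produce. The following base-R sketch builds both from the definitions in the table above. The function names `indicator_kernel`, `mask_kernel`, and `rbf_kernel`, and the toy data, are hypothetical and only mirror those definitions; they are not the package's implementation.

```r
# Illustrative helpers (hypothetical names, not part of zoomerGP).

# Indicator kernel: variance where the inputs are equal, zero otherwise.
indicator_kernel <- function(x1, x2, variance = 1) {
  variance * outer(x1, x2, "==")
}

# Mask kernel: one where both inputs equal one, zero otherwise.
mask_kernel <- function(m1, m2) {
  outer(as.numeric(m1 == 1), as.numeric(m2 == 1))
}

# Squared-exponential kernel, used here as the kernel being masked.
rbf_kernel <- function(x1, x2, lengthscale = 1) {
  exp(-outer(x1, x2, "-")^2 / (2 * lengthscale^2))
}

group   <- c(1, 1, 2, 2)          # cluster membership
treated <- c(0, 1, 1, 0)          # 1 = the masked kernel should be active
x       <- c(0.1, 0.4, 0.5, 0.9)

# Observations in the same cluster share covariance 0.5; others share none.
K_cluster <- indicator_kernel(group, group, variance = 0.5)

# The RBF kernel only acts on pairs where both observations are treated.
K_masked <- mask_kernel(treated, treated) * rbf_kernel(x, x)
```

Here `K_cluster` is block-diagonal by cluster, and every row and column of `K_masked` that belongs to an untreated observation is zero.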


## Fitting to Large Datasets with Sparse Approximations

Gaussian process regression is computationally intensive, and generally takes
$O(n^3)$ computations to fit. To allow the method to scale to large datasets,
this package implements the projected process approximation as described in
Rasmussen and Williams's (2005) *Gaussian Processes for Machine Learning*. To
turn this approximation on, use the `sparse=T` argument when fitting the
Gaussian process. Please be aware that you may also need to change
the default value of the number of inducing points (`n_points`) for numerical
stability.
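
As a rough illustration of why inducing points help, the sketch below computes a projected-process predictive mean in base R, using the standard form $\mu(x_*) = k_m(x_*)^\top (\sigma^2 K_{mm} + K_{mn} K_{nm})^{-1} K_{mn} y$. It is not the package's implementation. The simulated data, the fixed hyperparameters, the jitter term, the evenly spaced inducing points, and the helper `rbf_kernel` are all assumptions made for the example; only the name `n_points` is borrowed from the argument described above.

```r
# Illustrative sketch of the projected process predictive mean (not zoomerGP code).
set.seed(1)

rbf_kernel <- function(x1, x2, lengthscale = 1) {
  exp(-outer(x1, x2, "-")^2 / (2 * lengthscale^2))
}

# Simulated data: n noisy observations of sin(x).
n <- 500
x <- sort(runif(n, 0, 10))
y <- sin(x) + rnorm(n, sd = 0.2)

n_points  <- 20                              # number of inducing points
u         <- seq(0, 10, length.out = n_points)
noise_var <- 0.2^2                           # assumed known here
jitter    <- 1e-6                            # small diagonal term for stability

K_mm <- rbf_kernel(u, u) + diag(jitter, n_points)
K_mn <- rbf_kernel(u, x)

# Only an n_points x n_points system is solved, so the cost is
# O(n * n_points^2) rather than O(n^3).
A     <- noise_var * K_mm + tcrossprod(K_mn)
x_new <- seq(0, 10, length.out = 50)
mu    <- drop(rbf_kernel(x_new, u) %*% solve(A, K_mn %*% y))
```

With too few inducing points the approximation becomes coarse, and with many closely spaced points the matrices can become ill-conditioned, which is one reason the number of inducing points may need adjusting.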

## Optimization Methods

Currently, the package supports three engines for hyperparameter optimization.
`bfgs` is selected by default, but this can be changed using the
`training_method` argument to the `gaussian_process` function:

* `bfgs` - as implemented by the `optim` function in R. Will sometimes fail to
  converge, leading to an error.
* `rgenoud` - as implemented in the `rgenoud` package. This is likely the most
  robust optimizer as it avoids getting stuck in local minima, but also the
  most computationally intensive.
* `coin` - a learning-rate-free algorithm that is implemented in Rust. Seems to
  work well across a variety of problems, but can get stuck in local minima.
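
To give a sense of what a BFGS-based engine does, the following base-R sketch minimizes the negative log marginal likelihood of a small RBF Gaussian process with `optim(method = "BFGS")`. It is a simplified stand-in and not the package's training code: the objective `neg_log_ml`, the log-scale parameterization, the starting values, and the simulated data are assumptions made for the example. The call is wrapped in `tryCatch` because the Cholesky factorization or the finite-difference gradient can fail for extreme hyperparameter values.

```r
# Illustrative type-II maximum likelihood fit with BFGS (not zoomerGP code).
set.seed(42)
x <- seq(0, 5, length.out = 40)
y <- sin(2 * x) + rnorm(40, sd = 0.1)

# Negative log marginal likelihood. Parameters are optimized on the log scale
# so that the lengthscale and both variances stay positive.
neg_log_ml <- function(log_par) {
  lengthscale <- exp(log_par[1])
  signal_var  <- exp(log_par[2])
  noise_var   <- exp(log_par[3])
  d <- outer(x, x, "-")
  K <- signal_var * exp(-d^2 / (2 * lengthscale^2)) +
    diag(noise_var + 1e-8, length(x))
  R <- chol(K)                                   # upper triangular, K = R'R
  alpha <- backsolve(R, backsolve(R, y, transpose = TRUE))  # solves K alpha = y
  0.5 * sum(y * alpha) + sum(log(diag(R))) + 0.5 * length(x) * log(2 * pi)
}

start       <- c(0, 0, log(0.1))
start_value <- neg_log_ml(start)

# Returns NULL instead of stopping if the optimizer hits a numerical error.
fit <- tryCatch(
  optim(start, neg_log_ml, method = "BFGS"),
  error = function(e) NULL
)

if (!is.null(fit)) {
  estimates <- exp(fit$par)   # lengthscale, signal variance, noise variance
}
```

If `fit` comes back `NULL`, a natural next step is to retry from different starting values or to switch to a more robust engine.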

## Fitting Non-Gaussian Likelihoods with Variational Inference

`zoomerGP` also implements a variant of the [Stein Variational Gaussian Processes](https://arxiv.org/abs/2009.12141) algorithm to perform variational inference over hyperparameters and latent function values when the likelihood is non-Gaussian. Specifically, we use the [Coin Sampling](https://arxiv.org/abs/2301.11294) algorithm proposed by the same authors to provide a robust Variational Inference engine that does not require tuning by the user. 

At the moment, inference is implemented for the following non-Gaussian likelihoods:

- logistic regression (via the `logistic_gaussian_process` function).
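
For intuition about the logistic case, the sketch below simulates binary outcomes from a latent Gaussian process and evaluates the Bernoulli log-likelihood that a variational method would combine with the GP prior. It assumes the usual logit link between latent function values and class probabilities; the exact parameterization used by `logistic_gaussian_process` is not documented here, so treat this as an assumption. The helper `bernoulli_log_lik` is hypothetical.

```r
# Illustrative logistic GP model in base R (not zoomerGP code).
set.seed(7)
x <- seq(0, 5, length.out = 30)

# Draw latent function values f from a GP prior with an RBF kernel.
d <- outer(x, x, "-")
K <- exp(-d^2 / 2) + diag(1e-6, length(x))   # jitter keeps chol() stable
f <- drop(t(chol(K)) %*% rnorm(length(x)))

# The logit link maps latent values to probabilities, which generate 0/1 data.
p <- plogis(f)
y <- rbinom(length(x), size = 1, prob = p)

# Bernoulli log-likelihood of the outcomes given latent values. It is not
# Gaussian in f, so the posterior over f has no closed form.
bernoulli_log_lik <- function(f, y) {
  sum(dbinom(y, size = 1, prob = plogis(f), log = TRUE))
}
log_lik <- bernoulli_log_lik(f, y)
```

Because this likelihood is not conjugate to the Gaussian prior on `f`, the posterior has to be approximated, which is the role variational inference plays here.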

# Installation Instructions

## Installing Rust

If no pre-built binary of the package is available for your operating system or
version of R, you must have the
[Rust compiler](https://www.rust-lang.org/tools/install) installed to compile
this package from source. After the package is compiled, Rust is no longer
required and can be safely uninstalled.

#### Installing Rust on Linux or Mac:

To install Rust on Linux or Mac, you can simply run the following snippet in
your terminal.

``` sh
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

#### Installing Rust on Windows:

To install Rust on Windows, you can use the Rust installation wizard,
`rustup-init.exe`, found [at this
site](https://forge.rust-lang.org/infra/other-installation-methods.html).
Depending on your version of Windows, you may see an error that looks something like this:

```
error: toolchain 'stable-x86_64-pc-windows-gnu' is not installed
```

In this case, you should run `rustup install stable-x86_64-pc-windows-gnu` to
install the missing toolchain. If you're missing another toolchain, substitute
its name for `stable-x86_64-pc-windows-gnu` in the command above.

### Installing Package from GitHub:

Once you have installed Rust, you should be able to install the package
using either the `install_github` function from the `devtools` package or the
`pkg_install` function from the `pak` package.

``` r
## Install with devtools
# install.packages("devtools")
devtools::install_github("beniaminogreen/zoomergp")

## Install with pak
# install.packages("pak")
pak::pkg_install("beniaminogreen/zoomergp")
```

### Loading The Package

Once the package is installed, you can load it into memory as usual by typing:

```{r, warning = FALSE, message = FALSE, eval = F}
library(zoomerGP)
```

Owner

  • Name: Beniamino Green
  • Login: beniaminogreen
  • Kind: user
  • Location: New Haven, CT
  • Company: Yale University

Pre-doctoral Fellow

GitHub Events

Total
  • Push event: 5
Last Year
  • Push event: 5

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 31
  • Total Committers: 1
  • Avg Commits per committer: 31.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 31
  • Committers: 1
  • Avg Commits per committer: 31.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Beniamino Green b****n@y****u 31

Dependencies

.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.5.0 composite
  • actions/checkout v4 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
src/rust/Cargo.lock cargo
  • approx 0.5.1
  • autocfg 1.4.0
  • build-print 0.1.1
  • bumpalo 3.17.0
  • cauchy 0.4.0
  • cblas-sys 0.1.4
  • cfg-if 1.0.0
  • console 0.15.11
  • crossbeam-deque 0.8.6
  • crossbeam-epoch 0.9.18
  • crossbeam-utils 0.8.21
  • either 1.15.0
  • encode_unicode 1.0.0
  • extendr-api 0.8.0
  • extendr-ffi 0.8.0
  • extendr-macros 0.8.0
  • getrandom 0.2.16
  • indicatif 0.17.11
  • js-sys 0.3.77
  • katexit 0.1.5
  • lapack-sys 0.14.0
  • lax 0.17.0
  • libc 0.2.172
  • log 0.4.27
  • matrixmultiply 0.3.9
  • ndarray 0.16.1
  • ndarray-linalg 0.17.0
  • num-complex 0.4.6
  • num-integer 0.1.46
  • num-traits 0.2.19
  • number_prefix 0.4.0
  • once_cell 1.21.3
  • paste 1.0.15
  • portable-atomic 1.11.0
  • portable-atomic-util 0.2.4
  • ppv-lite86 0.2.21
  • proc-macro2 1.0.95
  • quote 1.0.40
  • rand 0.8.5
  • rand_chacha 0.3.1
  • rand_core 0.6.4
  • rawpointer 0.2.1
  • rayon 1.10.0
  • rayon-core 1.12.1
  • serde 1.0.219
  • serde_derive 1.0.219
  • syn 2.0.101
  • thiserror 2.0.12
  • thiserror-impl 2.0.12
  • unicode-ident 1.0.18
  • unicode-width 0.2.0
  • wasi 0.11.0+wasi-snapshot-preview1
  • wasm-bindgen 0.2.100
  • wasm-bindgen-backend 0.2.100
  • wasm-bindgen-macro 0.2.100
  • wasm-bindgen-macro-support 0.2.100
  • wasm-bindgen-shared 0.2.100
  • web-time 1.1.0
  • windows-sys 0.59.0
  • windows-targets 0.52.6
  • windows_aarch64_gnullvm 0.52.6
  • windows_aarch64_msvc 0.52.6
  • windows_i686_gnu 0.52.6
  • windows_i686_gnullvm 0.52.6
  • windows_i686_msvc 0.52.6
  • windows_x86_64_gnu 0.52.6
  • windows_x86_64_gnullvm 0.52.6
  • windows_x86_64_msvc 0.52.6
  • zerocopy 0.8.25
  • zerocopy-derive 0.8.25
src/rust/Cargo.toml cargo
DESCRIPTION cran
  • R >= 4.2 depends
  • memoise * imports
  • rgenoud * imports
  • rlang * imports
  • tidyselect * imports
  • ggplot2 * suggests
  • knitr * suggests
  • nlopt * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests
  • tibble * suggests