https://github.com/cvxgrp/randalo

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.0%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cvxgrp
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 659 KB
Statistics
  • Stars: 4
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme License

README.md

RandALO: fast randomized risk estimation for high-dimensional data

This repository contains a software package implementing RandALO, a fast randomized method for estimating the risk of machine learning models, as described in the paper:

P. T. Nobel, D. LeJeune, E. J. Candès. RandALO: Out-of-sample risk estimation in no time flat. 2024. arXiv:2409.09781.

Note: the experiments in the paper were run with an earlier version of the codebase, which is available in the paper-code branch.

Installation

From the folder where you want the repository to live, run the following:

```bash
git clone git@github.com:cvxgrp/randalo.git
cd randalo

# create a new environment with Python >= 3.10 (could also use venv or similar)
conda create -n randalo python=3.12
conda activate randalo

# install requirements and randalo
pip install -r requirements.txt
```
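After installing, a quick check from Python can confirm that the active interpreter meets the Python >= 3.10 requirement and that the package is importable. The helper below is a hypothetical convenience, not part of randalo, and uses only the standard library.

```python
import importlib.util
import sys


def check_environment(package="randalo", min_python=(3, 10)):
    """Return (python_ok, package_installed) for the active interpreter.

    Hypothetical helper for illustration; it only looks the package up
    and does not import it.
    """
    python_ok = sys.version_info >= min_python
    package_installed = importlib.util.find_spec(package) is not None
    return python_ok, package_installed


if __name__ == "__main__":
    python_ok, package_installed = check_environment()
    print(f"Python >= 3.10: {python_ok}, randalo importable: {package_installed}")
```

If the second value is `False`, the most likely cause is that `pip` installed into a different environment than the one currently active.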

Usage

Scikit-learn

The simplest way to use RandALO is with linear models from scikit-learn. See a longer demonstration in a notebook here.

```python
from torch import nn
from sklearn.linear_model import Lasso
from randalo import RandALO

X, y = ... # load data as np.ndarrays as usual

model = Lasso(1.0).fit(X, y) # fit the model
alo = RandALO.from_sklearn(model, X, y) # set up the Jacobian
mse_estimate = alo.evaluate(nn.MSELoss()) # estimate risk
```
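For intuition about what the estimate returned by `alo.evaluate` targets: ALO stands for approximate leave-one-out, so the quantity being approximated is the leave-one-out risk. The sketch below is not RandALO's algorithm and does not use the randalo package. It is a plain-Python illustration, for a hypothetical one-feature ridge regression without intercept, of the classical identity that recovers the exact leave-one-out residuals from a single fit, checked against brute-force refitting. All function names are made up for this illustration.

```python
def ridge_fit(xs, ys, lam):
    """Closed-form 1-D ridge: argmin_b sum((y - b*x)^2) + lam * b^2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def loo_mse_brute_force(xs, ys, lam):
    """Leave-one-out MSE by refitting n times, once per held-out point."""
    total = 0.0
    for i in range(len(xs)):
        b = ridge_fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)
        total += (ys[i] - b * xs[i]) ** 2
    return total / len(xs)


def loo_mse_shortcut(xs, ys, lam):
    """Leave-one-out MSE from one fit, using residual_i / (1 - h_i)."""
    b = ridge_fit(xs, ys, lam)
    denom = sum(x * x for x in xs) + lam
    total = 0.0
    for x, y in zip(xs, ys):
        h = x * x / denom  # leverage of this observation
        total += ((y - b * x) / (1.0 - h)) ** 2
    return total / len(xs)


if __name__ == "__main__":
    xs = [0.5, -1.0, 2.0, 1.5, -0.3]
    ys = [0.4, -1.2, 2.1, 1.3, -0.1]
    print(loo_mse_brute_force(xs, ys, 1.0), loo_mse_shortcut(xs, ys, 1.0))
```

For ridge the shortcut is exact. For non-smooth regularizers such as the lasso no such closed form is available in general, which is the setting where an approximate method is useful.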

We currently support the following models:

  • LinearRegression
  • Ridge
  • Lasso
  • LassoLars
  • ElasticNet
  • LogisticRegression
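A small guard can make the supported list explicit in calling code. The helper below is hypothetical (it is not part of randalo). It only compares a model's class name against the names listed above, so subclasses or variants with different class names would not match.

```python
# Class names copied from the list above; the helper itself is hypothetical.
SUPPORTED_MODELS = frozenset(
    {
        "LinearRegression",
        "Ridge",
        "Lasso",
        "LassoLars",
        "ElasticNet",
        "LogisticRegression",
    }
)


def is_supported(model) -> bool:
    """Return True if the model's class name appears in the supported list."""
    return type(model).__name__ in SUPPORTED_MODELS
```

Models that fail this check may still be usable by constructing the Jacobian manually.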

Linear models with any solver

If you prefer to fit your models with a solver other than scikit-learn, or if you want to use models beyond the ones listed above, you can still use RandALO by instantiating the Jacobian yourself. The one thing to be careful about is scaling the regularizer correctly for your problem formulation.

```python
from torch import nn
from sklearn.linear_model import Lasso
from randalo import RandALO, MSELoss, L1Regularizer, Jacobian

X, y = ... # load data as np.ndarrays as usual

model = Lasso(1.0).fit(X, y) # fit the model

# instantiate RandALO by creating a Jacobian object
loss = MSELoss()
reg = 2.0 * model.alpha * L1Regularizer() # scale the regularizer appropriately
y_hat = model.predict(X)
solution_func = lambda: model.coef_
jac = Jacobian(y, X, solution_func, loss, reg)
alo = RandALO(loss, jac, y, y_hat)

mse_estimate = alo.evaluate(nn.MSELoss()) # estimate risk
```
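The factor of `2.0` in the example above can be sanity-checked by hand. scikit-learn's `Lasso` minimizes `(1 / (2 * n)) * ||y - Xw||^2 + alpha * ||w||_1`. Assuming the loss is the squared error averaged over the `n` samples (as `torch.nn.MSELoss` is by default; this is an assumption about `randalo.MSELoss`, not something stated here), the objective `(1 / n) * ||y - Xw||^2 + 2 * alpha * ||w||_1` is exactly twice the scikit-learn one and so has the same minimizer. The standard-library sketch below checks that relationship numerically; the function names are illustrative only.

```python
def _squared_residuals(X, y, w):
    """Sum of squared residuals for rows X (list of lists), targets y, weights w."""
    return sum(
        (yi - sum(xij * wj for xij, wj in zip(xi, w))) ** 2
        for xi, yi in zip(X, y)
    )


def sklearn_lasso_objective(X, y, w, alpha):
    """scikit-learn's Lasso objective: (1 / (2n)) * RSS + alpha * ||w||_1."""
    n = len(y)
    return _squared_residuals(X, y, w) / (2 * n) + alpha * sum(abs(wj) for wj in w)


def mean_loss_objective(X, y, w, reg_scale):
    """Mean squared error plus reg_scale * ||w||_1 (assumed loss + regularizer form)."""
    n = len(y)
    return _squared_residuals(X, y, w) / n + reg_scale * sum(abs(wj) for wj in w)


if __name__ == "__main__":
    X = [[1.0, 0.5], [0.2, -1.0], [-0.7, 0.3]]
    y = [1.0, -0.5, 0.2]
    w = [0.3, -0.1]
    alpha = 1.0
    # With reg_scale = 2 * alpha the two objectives differ by a constant factor of 2.
    print(mean_loss_objective(X, y, w, 2.0 * alpha), 2.0 * sklearn_lasso_objective(X, y, w, alpha))
```

The same bookkeeping applies to any other solver: write down the objective the solver actually minimizes, rescale it so the loss term matches the loss object you pass in, and use the resulting coefficient on the regularizer.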

Please refer to our scikit-learn integration source code for more examples.

Owner

  • Name: Stanford University Convex Optimization Group
  • Login: cvxgrp
  • Kind: organization
  • Location: Stanford, CA

GitHub Events

Total
  • Watch event: 3
  • Push event: 49
  • Create event: 1
Last Year
  • Watch event: 3
  • Push event: 49
  • Create event: 1

Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • PTNobel (3)
  • dlej (2)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

.github/workflows/build.yml actions
  • actions/checkout v4 composite
  • actions/download-artifact v4 composite
  • actions/setup-python v5 composite
  • actions/upload-artifact v4 composite
  • pypa/gh-action-pypi-publish release/v1 composite
pyproject.toml pypi
requirements.txt pypi
  • cvxpy *
  • cvxpylayers *
  • matplotlib *
  • numpy ==2.1.1
  • pandas *
  • scikit-learn *
  • scipy *
  • torch *
  • tqdm *
setup.py pypi
  • numpy *
  • scipy *
  • torch *
  • torch-linops *