l0learn

Efficient Algorithms for L0 Regularized Learning

https://github.com/hazimehh/l0learn

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    6 of 8 committers (75.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (19.2%) to scientific vocabulary

Keywords

compressed-sensing feature-selection l0-regularization l0learn machine-learning regularization sparse-modeling sparse-regression
Last synced: 6 months ago

Repository

Efficient Algorithms for L0 Regularized Learning

Basic Info
  • Host: GitHub
  • Owner: hazimehh
  • License: other
  • Language: C++
  • Default Branch: master
  • Homepage:
  • Size: 71.4 MB
Statistics
  • Stars: 102
  • Watchers: 10
  • Forks: 31
  • Open Issues: 4
  • Releases: 0
Topics
compressed-sensing feature-selection l0-regularization l0learn machine-learning regularization sparse-modeling sparse-regression
Created over 8 years ago · Last pushed about 2 years ago
Metadata Files
  • Readme
  • Changelog
  • License

README.md

L0Learn: Fast Best Subset Selection

Hussein Hazimeh, Rahul Mazumder, and Tim Nonet

Massachusetts Institute of Technology

Introduction

L0Learn is a highly efficient framework for solving L0-regularized learning problems. It can (approximately) solve the following three problems, where the empirical loss is penalized by combinations of the L0, L1, and L2 norms:
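The three problems appear as an equation image in the original README; they are reconstructed here from the paper's formulation, with $f(\beta)$ the empirical loss and $\lambda, \gamma \ge 0$ the regularization parameters:

$$
\begin{aligned}
&\min_{\beta}\; f(\beta) + \lambda \lVert\beta\rVert_0 && \text{(L0)} \\
&\min_{\beta}\; f(\beta) + \lambda \lVert\beta\rVert_0 + \gamma \lVert\beta\rVert_1 && \text{(L0L1)} \\
&\min_{\beta}\; f(\beta) + \lambda \lVert\beta\rVert_0 + \gamma \lVert\beta\rVert_2^2 && \text{(L0L2)}
\end{aligned}
$$

where $\lVert\beta\rVert_0$ is the number of nonzero entries of $\beta$.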

We support both regression (using squared error loss) and classification (using logistic or squared hinge loss). Optimization is done using coordinate descent and local combinatorial search over a grid of regularization parameter values. Several computational tricks and heuristics are used to speed up the algorithms and improve the solution quality. These heuristics include warm starts, active-set convergence, correlation screening, a greedy cycling order, and efficient residual updates that exploit sparsity and the problem dimensions. Moreover, we employ a new, computationally efficient method for dynamically selecting the regularization parameter λ along the path. We describe the details of the algorithms in our paper Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms (see Paper 2 under "Citing L0Learn" below).

The toolkit is implemented in C++11 and can often run faster than popular sparse learning toolkits (see our experiments in the paper above). We also provide an easy-to-use R interface; see the section below for installation and usage of the R package.

NEW: Version 2 (03/2021) adds support for sparse matrices and box constraints on the coefficients.
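A minimal sketch of these two features, assuming the documented `dgCMatrix` input and the `lows`/`highs` arguments (the toy data is purely illustrative):

```R
library(L0Learn)
library(Matrix)

set.seed(1)
X <- matrix(rnorm(200 * 50), 200, 50)
X[abs(X) < 1.2] <- 0                    # sparsify the toy design matrix
X_sparse <- Matrix(X, sparse = TRUE)    # dgCMatrix input, supported since v2
y <- rnorm(200)

# Box constraints: every coefficient is restricted to [-1, 1].
fit <- L0Learn.fit(X_sparse, y, penalty = "L0", lows = -1, highs = 1)
```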

R Package Installation

The latest version (v2.1.0) can be installed from CRAN as follows:

```R
install.packages("L0Learn", repos = "http://cran.rstudio.com")
```

Alternatively, L0Learn can also be installed from GitHub as follows:

```R
library(devtools)
install_github("hazimehh/L0Learn")
```

L0Learn's changelog can be accessed from here.

Usage

For a tutorial, please refer to L0Learn's Vignette. For a detailed description of the API, check the Reference Manual.
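As a quick starting point before the Vignette, here is a minimal sketch of a typical workflow on toy data (function and argument names follow the package documentation; the lambda index chosen below is arbitrary):

```R
library(L0Learn)

set.seed(1)
X <- matrix(rnorm(500 * 100), 500, 100)   # n = 500, p = 100
beta <- c(rep(1, 10), rep(0, 90))         # 10 true nonzero coefficients
y <- as.vector(X %*% beta) + rnorm(500)

# Fit a regularization path under the pure L0 penalty.
fit <- L0Learn.fit(X, y, penalty = "L0", maxSuppSize = 20)

# Extract coefficients and predictions at one lambda on the fitted path
# (gamma = 0 corresponds to the pure L0 penalty).
lambda1 <- fit$lambda[[1]][5]
coefs <- coef(fit, lambda = lambda1, gamma = 0)
yhat  <- predict(fit, newx = X, lambda = lambda1, gamma = 0)
```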

FAQ

Which penalty to use?

Pure L0 regularization can overfit when the signal strength in the data is relatively low. Adding L2 regularization can alleviate this problem and lead to competitive models (see the experiments in our paper). Thus, in practice, we strongly recommend using the L0L2 penalty. Ideally, the parameter gamma (for L2 regularization) should be tuned over a sufficiently large interval, and this can be performed using L0Learn's built-in cross-validation method.
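A sketch of this recommended L0L2 workflow, assuming the documented cross-validation interface (the gamma grid below is an illustrative choice, not a prescription):

```R
library(L0Learn)

set.seed(1)
X <- matrix(rnorm(300 * 80), 300, 80)
y <- as.vector(X[, 1:5] %*% rep(2, 5)) + rnorm(300)

# Cross-validate jointly over a grid of gamma (L2) and lambda (L0) values.
cvfit <- L0Learn.cvfit(X, y, penalty = "L0L2", nFolds = 5,
                       nGamma = 10, gammaMin = 1e-4, gammaMax = 10)

# cvMeans holds one set of CV errors per gamma value; pick the
# (gamma, lambda) pair with the smallest error.
best_g <- which.min(sapply(cvfit$cvMeans, min))
best_l <- which.min(cvfit$cvMeans[[best_g]])
coefs <- coef(cvfit,
              lambda = cvfit$fit$lambda[[best_g]][best_l],
              gamma  = cvfit$fit$gamma[best_g])
```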

Which algorithm to use?

By default, L0Learn uses a coordinate descent-based algorithm, which achieves competitive run times compared to popular sparse learning toolkits. This can work well for many applications. We also offer a local search algorithm that is guaranteed to return solutions of the same or higher quality, at the expense of increased run time. We recommend the local search algorithm if the problem has highly correlated features or if the number of samples is much smaller than the number of features; see the local search section of the Vignette for how to use this algorithm.
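The choice is controlled by the `algorithm` argument of `L0Learn.fit`; a short sketch contrasting the two on toy data with n much smaller than p, the regime where local search tends to pay off:

```R
library(L0Learn)

set.seed(1)
n <- 100; p <- 1000
X <- matrix(rnorm(n * p), n, p)
y <- as.vector(X[, 1:10] %*% rep(1, 10)) + rnorm(n)

fit_cd    <- L0Learn.fit(X, y, penalty = "L0L2", algorithm = "CD")    # default
fit_cdpsi <- L0Learn.fit(X, y, penalty = "L0L2", algorithm = "CDPSI") # + local search
```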

How to certify optimality?

While for many challenging statistical instances L0Learn can lead to optimal or near-optimal solutions, it cannot provide certificates of optimality. Such certificates can be provided via Integer Programming. Our toolkit L0BnB is a scalable integer programming framework for L0-regularized regression, which can provide such certificates and potentially improve upon the solutions of L0Learn (if they are sub-optimal). We recommend using L0Learn first to obtain a candidate solution (or a pool of solutions) and then checking optimality using L0BnB.
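A sketch of the first half of that workflow: extracting a candidate support from L0Learn, which could then be handed to L0BnB (a separate Python package, not shown here) for certification. The lambda index below is an arbitrary illustrative choice:

```R
library(L0Learn)

set.seed(1)
X <- matrix(rnorm(200 * 60), 200, 60)
y <- as.vector(X[, 1:4] %*% rep(3, 4)) + rnorm(200)

fit <- L0Learn.fit(X, y, penalty = "L0")
b <- as.numeric(coef(fit, lambda = fit$lambda[[1]][10], gamma = 0))
support <- which(b[-1] != 0)   # drop the intercept (first entry)
```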

Citing L0Learn

If you find L0Learn useful in your research, please consider citing the following papers.

Paper 1 (Toolkit):

```bibtex
@article{hazimeh2023l0learn,
  title   = {L0learn: A scalable package for sparse learning using l0 regularization},
  author  = {Hazimeh, Hussein and Mazumder, Rahul and Nonet, Tim},
  journal = {Journal of Machine Learning Research},
  volume  = {24},
  number  = {205},
  pages   = {1--8},
  year    = {2023}
}
```

Paper 2 (Regression):

```bibtex
@article{doi:10.1287/opre.2019.1919,
  author  = {Hazimeh, Hussein and Mazumder, Rahul},
  title   = {Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms},
  journal = {Operations Research},
  volume  = {68},
  number  = {5},
  pages   = {1517-1537},
  year    = {2020},
  doi     = {10.1287/opre.2019.1919},
  url     = {https://doi.org/10.1287/opre.2019.1919},
  eprint  = {https://doi.org/10.1287/opre.2019.1919}
}
```

Paper 3 (Classification):

```bibtex
@article{JMLR:v22:19-1049,
  author  = {Antoine Dedieu and Hussein Hazimeh and Rahul Mazumder},
  title   = {Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {135},
  pages   = {1-47},
  url     = {http://jmlr.org/papers/v22/19-1049.html}
}
```

Owner

  • Name: Hussein Hazimeh
  • Login: hazimehh
  • Kind: user
  • Location: United States
  • Company: Google

GitHub Events

Total
  • Issues event: 1
  • Watch event: 8
Last Year
  • Issues event: 1
  • Watch event: 8

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 521
  • Total Committers: 8
  • Avg Commits per committer: 65.125
  • Development Distribution Score (DDS): 0.453
Past Year
  • Commits: 4
  • Committers: 2
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.25
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| Hussein Hazimeh | hh@i****g | 285 |
| Tim Nonet | t****t@g****m | 126 |
| Hussein Hazimeh | h****h@m****u | 100 |
| Hussein Hazimeh | h****2@i****u | 3 |
| Hussein Hazimeh | h****z@g****m | 3 |
| Tim Nonet | t****t@m****u | 2 |
| Jiachang Liu | j****u@d****u | 1 |
| rahulmaz | r****z@m****u | 1 |

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 18
  • Total pull requests: 50
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 1 day
  • Total issue authors: 10
  • Total pull request authors: 3
  • Average comments per issue: 1.89
  • Average comments per pull request: 0.16
  • Merged pull requests: 47
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • vtshen (6)
  • tomwenseleers (4)
  • hazimehh (1)
  • talegari (1)
  • sentian (1)
  • bommert (1)
  • dliviya (1)
  • stephens999 (1)
  • baogorek (1)
  • wbnicholson (1)
  • pbreheny (1)
Pull Request Authors
  • hazimehh (34)
  • TNonet (16)
  • jiachangliu (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 102 last-month
    • cran 648 last-month
  • Total docker downloads: 20,358
  • Total dependent packages: 1
    (may contain duplicates)
  • Total dependent repositories: 4
    (may contain duplicates)
  • Total versions: 12
  • Total maintainers: 2
cran.r-project.org: L0Learn

Fast Algorithms for Best Subset Selection

  • Versions: 9
  • Dependent Packages: 1
  • Dependent Repositories: 3
  • Downloads: 648 Last month
  • Docker Downloads: 20,358
Rankings
Forks count: 1.9%
Stargazers count: 4.3%
Average: 12.1%
Docker downloads count: 12.6%
Dependent repos count: 16.5%
Dependent packages count: 18.1%
Downloads: 19.2%
Maintainers (1)
Last synced: 6 months ago
pypi.org: l0learn

L0Learn is a highly efficient framework for solving L0-regularized learning problems.

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 102 Last month
Rankings
Forks count: 6.4%
Stargazers count: 7.5%
Dependent packages count: 10.0%
Average: 14.9%
Dependent repos count: 21.7%
Downloads: 28.9%
Maintainers (1)
Last synced: 7 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.3.0 depends
  • MASS * imports
  • Matrix * imports
  • Rcpp >= 0.12.13 imports
  • ggplot2 * imports
  • methods * imports
  • reshape2 * imports
  • covr * suggests
  • knitr * suggests
  • pracma * suggests
  • raster * suggests
  • rmarkdown * suggests
  • testthat * suggests