hal9001

hal9001: Scalable highly adaptive lasso regression in R - Published in JOSS (2020)

https://github.com/tlverse/hal9001

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 13 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    3 of 10 committers (30.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

cross-validation lasso-regression machine-learning-algorithms nonparametric-regression

Keywords from Contributors

causal-inference conditional-density-estimates density-estimation highly-adaptive-lasso inverse-probability-weights propensity-score

Repository

🤠 📿 The Highly Adaptive Lasso

Basic Info
Statistics
  • Stars: 49
  • Watchers: 4
  • Forks: 14
  • Open Issues: 7
  • Releases: 5
Created over 8 years ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog Contributing License

README.Rmd

---
output:
  rmarkdown::github_document
bibliography: "inst/REFERENCES.bib"
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# R/`hal9001`

[![R-CMD-check](https://github.com/tlverse/hal9001/workflows/R-CMD-check/badge.svg)](https://github.com/tlverse/hal9001/actions)
[![Coverage Status](https://codecov.io/gh/tlverse/hal9001/branch/master/graph/badge.svg)](https://app.codecov.io/gh/tlverse/hal9001)
[![CRAN](https://www.r-pkg.org/badges/version/hal9001)](https://www.r-pkg.org/pkg/hal9001)
[![CRAN downloads](https://cranlogs.r-pkg.org/badges/hal9001)](https://CRAN.R-project.org/package=hal9001)
[![CRAN total downloads](http://cranlogs.r-pkg.org/badges/grand-total/hal9001)](https://CRAN.R-project.org/package=hal9001)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](http://www.gnu.org/licenses/gpl-3.0)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3558313.svg)](https://doi.org/10.5281/zenodo.3558313)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.02526/status.svg)](https://doi.org/10.21105/joss.02526)

> The _Scalable_ Highly Adaptive Lasso

__Authors:__ [Jeremy Coyle](https://github.com/tlverse), [Nima
Hejazi](https://nimahejazi.org), [Rachael
Phillips](https://github.com/rachaelvp), [Lars van der
Laan](https://github.com/Larsvanderlaan), and [Mark van der
Laan](https://vanderlaan-lab.org/)

---

## What's `hal9001`?

`hal9001` is an R package providing an implementation of the scalable _highly
adaptive lasso_ (HAL), a nonparametric regression estimator that applies
L1-regularized lasso regression to a design matrix composed of indicator
functions corresponding to the support of the functional over a set of
covariates and interactions thereof. HAL regression allows for arbitrarily
complex functional forms to be estimated at fast (near-parametric) convergence
rates under only global smoothness assumptions [@vdl2017generally;
@bibaut2019fast]. For detailed theoretical discussions of the highly adaptive
lasso estimator, see, for example, @vdl2017generally, @vdl2017finite, and
@vdl2017uniform. For a computational demonstration of the versatility of HAL
regression, see @benkeser2016hal. Recent theoretical works
have demonstrated success in building efficient estimators of complex
parameters when particular variations of HAL regression are used to estimate
nuisance parameters [e.g., @vdl2019efficient; @ertefaie2020nonparametric].
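
To make the "design matrix composed of indicator functions" concrete, here is a
minimal sketch for a single covariate, assuming knots are placed at the
observed values so that column `j` is the indicator `1(x >= knot_j)`. The
helper `make_indicator_basis()` is hypothetical and written only for this
illustration; it is not part of `hal9001`, whose own basis construction also
covers interactions among covariates and is implemented differently. The lasso
step uses `glmnet`, which `hal9001` imports, and is skipped if `glmnet` is not
installed.

```r
# Sketch only: a hand-built indicator basis for one covariate.
# `make_indicator_basis` is a hypothetical helper, not a hal9001 function.
make_indicator_basis <- function(x, knots = sort(unique(x))) {
  # entry (i, j) is 1 if x[i] >= knots[j], and 0 otherwise
  basis <- outer(x, knots, FUN = ">=") * 1
  colnames(basis) <- paste0("knot_", seq_along(knots))
  basis
}

# simulate a smooth signal observed with noise
set.seed(1)
x_obs <- runif(50, min = -2, max = 2)
y_obs <- sin(2 * x_obs) + rnorm(50, mean = 0, sd = 0.1)

# one row per observation, one column per knot
phi <- make_indicator_basis(x_obs)
dim(phi)

# L1-penalized regression on the indicator basis, with the penalty chosen
# by cross-validation
if (requireNamespace("glmnet", quietly = TRUE)) {
  lasso_fit <- glmnet::cv.glmnet(x = phi, y = y_obs, family = "gaussian")
  coefs <- as.numeric(coef(lasso_fit, s = "lambda.min"))
  # number of indicator functions retained by the lasso (intercept excluded)
  sum(coefs[-1] != 0)
}
```

The fitted function in this sketch is a step function whose jumps sit at the
knots with nonzero coefficients; the L1 penalty controls how many jumps are
kept.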

---

## Installation

For standard use, we recommend installing the package from
[CRAN](https://CRAN.R-project.org/package=hal9001) via

```{r cran-installation, eval = FALSE}
install.packages("hal9001")
```

To contribute, install the _development version_ of `hal9001` from GitHub via
[`remotes`](https://CRAN.R-project.org/package=remotes):

```{r gh-master-installation, eval = FALSE}
remotes::install_github("tlverse/hal9001")
```

---

## Issues

If you encounter any bugs or have any specific feature requests, please [file an
issue](https://github.com/tlverse/hal9001/issues).

---

## Example

Consider the following minimal example of using `hal9001` to generate
predictions via Highly Adaptive Lasso regression:

```{r example}
# load the package and set a seed
library(hal9001)
set.seed(385971)

# simulate data
n <- 100
p <- 3
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] * sin(x[, 2]) + rnorm(n, mean = 0, sd = 0.2)

# fit the HAL regression
hal_fit <- fit_hal(X = x, Y = y, yolo = TRUE)
hal_fit$times

# training sample prediction and mean squared error
preds <- predict(hal_fit, new_data = x)
hal_mse <- mean((preds - y)^2)
hal_mse
```
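
The error computed above is measured on the training sample, so it will tend to
be optimistic. As a rough sketch of how one might check performance on held-out
data, the block below draws a second sample from the same data-generating
process and evaluates the fit on it. The helper `simulate_data()` and the
holdout size of 500 are assumptions made for this illustration; only
`fit_hal()` and `predict()` with `new_data` come from the package, and the HAL
step is skipped if `hal9001` is not installed.

```r
# Sketch only: held-out mean squared error for a HAL fit.
# `simulate_data` is a hypothetical helper written for this illustration.
simulate_data <- function(n, p = 3) {
  x <- matrix(rnorm(n * p), n, p)
  y <- x[, 1] * sin(x[, 2]) + rnorm(n, mean = 0, sd = 0.2)
  list(x = x, y = y)
}

set.seed(385971)
train <- simulate_data(n = 100)
holdout <- simulate_data(n = 500)

if (requireNamespace("hal9001", quietly = TRUE)) {
  # fit on the training sample with the package defaults
  hal_fit <- hal9001::fit_hal(X = train$x, Y = train$y)

  # predict on data the fit has not seen and compute the mean squared error
  holdout_preds <- predict(hal_fit, new_data = holdout$x)
  holdout_mse <- mean((holdout_preds - holdout$y)^2)
  print(holdout_mse)
}
```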

---

## Contributions

Contributions are very welcome. Interested contributors should consult our
[contribution
guidelines](https://github.com/tlverse/hal9001/blob/master/CONTRIBUTING.md)
prior to submitting a pull request.

---

## Citation

After using the `hal9001` R package, please cite both of the following:

        @software{coyle2022hal9001-rpkg,
          author = {Coyle, Jeremy R and Hejazi, Nima S and Phillips, Rachael V
            and {van der Laan}, Lars and {van der Laan}, Mark J},
          title = {{hal9001}: The scalable highly adaptive lasso},
          year  = {2022},
          url = {https://doi.org/10.5281/zenodo.3558313},
          doi = {10.5281/zenodo.3558313},
          note = {{R} package version 0.4.2}
        }

        @article{hejazi2020hal9001-joss,
          author = {Hejazi, Nima S and Coyle, Jeremy R and {van der Laan}, Mark
            J},
          title = {{hal9001}: Scalable highly adaptive lasso regression in
            {R}},
          year  = {2020},
          url = {https://doi.org/10.21105/joss.02526},
          doi = {10.21105/joss.02526},
          journal = {Journal of Open Source Software},
          publisher = {The Open Journal}
        }

---

## License

© 2017-2024 [Jeremy Coyle](https://github.com/tlverse) and [Nima
Hejazi](https://nimahejazi.org)

The contents of this repository are distributed under the GPL-3 license. See
file `LICENSE` for details.

---

## References

Owner

  • Name: tlverse
  • Login: tlverse
  • Kind: organization
  • Location: Berkeley, CA, USA

An extensible ecosystem of R packages for targeted causal machine learning

JOSS Publication

hal9001: Scalable highly adaptive lasso regression in R
Published
September 26, 2020
Volume 5, Issue 53, Page 2526
Authors
Nima S. Hejazi
Graduate Group in Biostatistics, University of California, Berkeley; Division of Biostatistics, School of Public Health, University of California, Berkeley; Center for Computational Biology, University of California, Berkeley
Jeremy R. Coyle
Division of Biostatistics, School of Public Health, University of California, Berkeley
Mark J. van der Laan
Division of Biostatistics, School of Public Health, University of California, Berkeley; Department of Statistics, University of California, Berkeley; Center for Computational Biology, University of California, Berkeley
Editor
Mikkel Meyer Andersen
Tags
machine learning targeted learning causal inference

GitHub Events

Total
  • Issues event: 1
  • Watch event: 1
  • Push event: 2
  • Pull request event: 3
Last Year
  • Issues event: 1
  • Watch event: 1
  • Push event: 2
  • Pull request event: 3

Committers

All Time
  • Total Commits: 444
  • Total Committers: 10
  • Avg Commits per committer: 44.4
  • Development Distribution Score (DDS): 0.534
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Nima Hejazi nh@n****g 207
rachaelvp r****s@g****m 96
Jeremy Coyle j****e@g****m 54
Lars van der Laan V****s@y****m 47
Lars van der Laan l****n@b****u 23
Wilson Cai w****i@b****u 8
YuZhang2019 y****2@o****m 3
David McCoy 5****s 3
Benkeser b****r@e****u 2
Katrin Leinweber k****i@p****e 1

Issues and Pull Requests

All Time
  • Total issues: 41
  • Total pull requests: 70
  • Average time to close issues: 10 months
  • Average time to close pull requests: 28 days
  • Total issue authors: 21
  • Total pull request authors: 11
  • Average comments per issue: 2.05
  • Average comments per pull request: 1.37
  • Merged pull requests: 62
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 1
  • Average time to close issues: 4 months
  • Average time to close pull requests: 20 minutes
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • nhejazi (15)
  • jlstiles (3)
  • benkeser (3)
  • rrrlw (2)
  • daviddewhurst (2)
  • rachaelvp (1)
  • tcovert (1)
  • SeraphinaShi (1)
  • Larsvanderlaan (1)
  • jaganmn (1)
  • HughSt (1)
  • matthewvowels1 (1)
  • Lauren-EylerDang (1)
  • prockenschaub (1)
  • jeremyrcoyle (1)
Pull Request Authors
  • nhejazi (32)
  • Larsvanderlaan (13)
  • jeremyrcoyle (10)
  • rachaelvp (6)
  • wilsoncai1992 (4)
  • YuZhang2019 (2)
  • tq21 (2)
  • katrinleinweber (1)
  • WenxinZhang25 (1)
  • blind-contours (1)
  • benkeser (1)
Top Labels
Issue Labels
enhancement (12) bug (9) question (6) help wanted (1)
Pull Request Labels
enhancement (15) bug (4)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 1,496 last-month
  • Total dependent packages: 3
  • Total dependent repositories: 26
  • Total versions: 7
  • Total maintainers: 1
cran.r-project.org: hal9001

The Scalable Highly Adaptive Lasso

  • Versions: 7
  • Dependent Packages: 3
  • Dependent Repositories: 26
  • Downloads: 1,496 Last month
Rankings
Forks count: 4.8%
Dependent repos count: 5.4%
Stargazers count: 7.0%
Average: 7.5%
Downloads: 9.6%
Dependent packages count: 10.9%

Dependencies

DESCRIPTION cran
  • R >= 3.1.0 depends
  • Rcpp * depends
  • Matrix * imports
  • assertthat * imports
  • data.table * imports
  • glmnet * imports
  • methods * imports
  • origami >= 1.0.3 imports
  • stats * imports
  • stringr * imports
  • utils * imports
  • SuperLearner * suggests
  • dplyr * suggests
  • future * suggests
  • ggplot2 * suggests
  • knitr * suggests
  • microbenchmark * suggests
  • rmarkdown * suggests
  • survival * suggests
  • testthat * suggests
  • tidyr * suggests
.github/workflows/R-CMD-check.yml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-tinytex v1 composite