compboost

compboost: Modular Framework for Component-Wise Boosting - Published in JOSS (2018)

https://github.com/schalkdaniel/compboost

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 7 DOI reference(s) in README and JOSS metadata
✓ Academic publication links: Links to joss.theoj.org
○ Committers with academic emails
○ Institutional organization owner
✓ JOSS paper metadata: Published in Journal of Open Source Software

Keywords

boosting-framework interpretable-machine-learning machine-learning

Last synced: 6 months ago

Repository

C++ implementation and R API for componentwise boosting

Basic Info

Host: GitHub
Owner: schalkdaniel
License: lgpl-3.0
Language: C++
Default Branch: main
Homepage: https://schalkdaniel.github.io/compboost/
Size: 203 MB

Statistics

Stars: 23
Watchers: 1
Forks: 3
Open Issues: 27
Releases: 0

Topics

boosting-framework interpretable-machine-learning machine-learning

Created over 8 years ago · Last pushed almost 3 years ago

Metadata Files

Readme Changelog Contributing License Code of conduct

README.Rmd

---
output: github_document
---



```{r, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/"
)

library(compboost)
ggplot2::theme_set(ggthemes::theme_tufte())
ggplot = function(...) ggplot2::ggplot(...) + ggplot2::scale_color_brewer(palette = "Set1")

set.seed(31415)
```

# compboost: Fast and Flexible Component-Wise Boosting Framework 

[![R-CMD-check](https://github.com/schalkdaniel/compboost/workflows/R-CMD-check/badge.svg)](https://github.com/schalkdaniel/compboost/actions)
[![codecov](https://codecov.io/gh/schalkdaniel/compboost/branch/main/graph/badge.svg?token=t3Xxoxz1T2)](https://codecov.io/gh/schalkdaniel/compboost)
 [![License: LGPL v3](https://img.shields.io/badge/License-LGPL_v3-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0) [![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/compboost)](https://cran.r-project.org/package=compboost) [![status](http://joss.theoj.org/papers/94cfdbbfdfc8796c5bdb1a74ee59fcda/status.svg)](http://joss.theoj.org/papers/94cfdbbfdfc8796c5bdb1a74ee59fcda)

[Documentation](https://danielschalk.com/compboost/) |
[Contributors](CONTRIBUTORS.md) |
[Release Notes](NEWS.md)

## Overview

Component-wise boosting applies the boosting framework to
statistical models, e.g., generalized additive models using component-wise smoothing
splines. Boosting these kinds of models maintains interpretability and enables
unbiased model selection in high-dimensional feature spaces.
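
To make the idea concrete, here is a minimal sketch of component-wise boosting in base `R`, assuming a quadratic loss and one simple linear base-learner per feature. It does not use the `compboost` API, and the function name `componentwise_boost` and its arguments are hypothetical names chosen for this illustration. In each iteration every feature is fitted separately to the current residuals, and only the best-fitting component is updated:

```r
# Component-wise L2 boosting with linear base-learners (base R only).
# All names below are hypothetical and not part of compboost.
componentwise_boost = function(X, y, iterations = 100L, learning_rate = 0.1) {
  X = scale(X, center = TRUE, scale = FALSE)  # center features, so no per-learner intercept
  offset = mean(y)                            # constant initial model
  coefs = setNames(numeric(ncol(X)), colnames(X))
  pred = rep(offset, length(y))

  for (m in seq_len(iterations)) {
    residuals = y - pred                      # negative gradient of the quadratic loss
    # Least-squares fit of every single feature to the residuals.
    betas = drop(crossprod(X, residuals)) / colSums(X^2)
    sse = vapply(seq_len(ncol(X)), function(j) {
      sum((residuals - X[, j] * betas[j])^2)
    }, numeric(1))
    best = which.min(sse)                     # select the best-fitting component
    coefs[best] = coefs[best] + learning_rate * betas[best]
    pred = pred + learning_rate * X[, best] * betas[best]
  }
  list(offset = offset, coefficients = coefs, fitted = pred)
}

# Simulated data: only x1 and x3 carry signal.
set.seed(1)
X = matrix(rnorm(200 * 5), ncol = 5, dimnames = list(NULL, paste0("x", 1:5)))
y = 2 * X[, "x1"] - X[, "x3"] + rnorm(200, sd = 0.1)

fit = componentwise_boost(X, y, iterations = 200L)
round(fit$coefficients, 2)
```

Features that are never selected keep a coefficient of exactly zero, which is where the built-in model selection comes from, and the final model is a sum of per-feature effects that can be inspected one at a time.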

The `R` package `compboost` is an alternative implementation of component-wise
boosting written in `C++` to obtain high runtime
performance and full memory control. The main idea is to provide a modular
class system which can be extended without editing the
source code. Therefore, it is possible to use `R` functions as well as
`C++` functions for custom base-learners, losses, logging mechanisms or
stopping criteria.
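
As a rough illustration of what a custom loss written as plain `R` functions could look like, the sketch below defines a quadratic loss, its gradient, and a constant initializer. The function names and their `(truth, prediction)` signatures are assumptions made for this example; how such functions are registered with `compboost` is not shown here and should be taken from the package documentation.

```r
# Hypothetical building blocks of a custom quadratic loss in plain R.
quadratic_loss = function(truth, prediction) 0.5 * (truth - prediction)^2
quadratic_gradient = function(truth, prediction) prediction - truth
# Constant that minimizes the empirical risk of the quadratic loss.
constant_init = function(truth) mean(truth)

y_demo = c(1, 2, 4)
init = constant_init(y_demo)

# Empirical risk of the constant model and the pseudo-residuals
# (negative gradient) that the first boosting iteration would fit.
risk = mean(quadratic_loss(y_demo, init))
pseudo_residuals = -quadratic_gradient(y_demo, init)
```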

For an introduction to and overview of the functionality, visit the [project page](https://schalkdaniel.github.io/compboost/).

## Installation

### Development version

```r
devtools::install_github("schalkdaniel/compboost")
```

## Examples

The examples are rendered using compboost `r packageVersion("compboost")`.

The fastest way to train a `Compboost` model is to use the wrapper functions `boostLinear()` or `boostSplines()`:
```{r, results="hide", warning=FALSE, fig.width=10, fig.height=2, out.width="100%"}
cboost = boostSplines(data = iris, target = "Sepal.Length",
  oob_fraction = 0.3, iterations = 500L, trace = 100L)

ggrisk = plotRisk(cboost)
ggpe = plotPEUni(cboost, "Petal.Length")
ggicont = plotIndividualContribution(cboost, iris[70, ], offset = FALSE)

library(patchwork)

ggrisk + ggpe + ggicont
```

For more extensive examples and instructions on using the `R6` interface, visit the [project page](https://danielschalk.com/compboost/articles/getting_started/use_case.html).

## mlr3 learners

`compboost` also ships [`mlr3`](https://mlr3.mlr-org.com/) learners for regression and binary classification, which can be used to apply `compboost` within the whole [`mlr3verse`](https://mlr3.mlr-org.com/):

```{r}
library(mlr3)

ts = tsk("spam")
lcboost = lrn("classif.compboost", iterations = 500L, bin_root = 2)
lcboost$train(ts)
lcboost$predict_type = "prob"
lcboost$predict(ts)

# Use the `$model` field to access all the `compboost` functionality:
plotBaselearnerTraces(lcboost$model) +
  plotPEUni(lcboost$model, "charDollar")
```

## Save and load models

Because the backend consists of `C++` objects, models cannot be saved with `R`'s `save()` function. Instead, use `$saveToJson("mymodel.json")` to save the model to `mymodel.json` and `Compboost$new(file = "mymodel.json")` to load it again:

```{r, eval=FALSE}
cboost = boostSplines(iris, "Sepal.Width")
cboost$saveToJson("mymodel.json")

cboost_new = Compboost$new(file = "mymodel.json")

# Save the model without data:
cboost$saveToJson("mymodel_without_data.json", rm_data = TRUE)
```
```{r, include=FALSE}
file.remove("mymodel.json", "mymodel_without_data.json")
```

## Benchmark

- A small benchmark compares the runtime and memory consumption of `compboost` with those of [`mboost`](https://cran.r-project.org/web/packages/mboost/index.html). The results can be read [here](https://github.com/schalkdaniel/compboost/tree/master/benchmark).
- A larger benchmark, including adaptations that improve runtime and memory efficiency, can be found [here](https://doi.org/10.1080/10618600.2022.2116446).

## Citing

To cite `compboost` in publications, please use:

> Schalk et al., (2018). compboost: Modular Framework for Component-Wise Boosting. Journal of Open Source Software, 3(30), 967, https://doi.org/10.21105/joss.00967

```
@article{schalk2018compboost,
  author = {Daniel Schalk and Janek Thomas and Bernd Bischl},
  title = {compboost: Modular Framework for Component-Wise Boosting},
  URL = {https://doi.org/10.21105/joss.00967},
  year = {2018},
  publisher = {Journal of Open Source Software},
  volume = {3},
  number = {30},
  pages = {967},
  journal = {JOSS}
}
```

## Testing

### On your local machine

To test the package functionality on your local machine, use `devtools`:

```{r, eval=FALSE}
devtools::test()
```

Owner

Name: Daniel Schalk
Login: schalkdaniel
Kind: user
Location: Munich
Company: @slds-lmu

Website: danielschalk.com
Repositories: 6
Profile: https://github.com/schalkdaniel

JOSS Publication

compboost: Modular Framework for Component-Wise Boosting

Published

October 12, 2018

DOI

10.21105/joss.00967

Volume 3, Issue 30, Page 967

Authors

Daniel Schalk

Department of Statistics, LMU Munich

Janek Thomas

Department of Statistics, LMU Munich

Bernd Bischl

Department of Statistics, LMU Munich

Editor

Roman Valls Guimera

Committers

Last synced: 7 months ago

All Time

Total Commits: 1,070
Total Committers: 17
Avg Commits per committer: 62.941
Development Distribution Score (DDS): 0.14

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
schalkdaniel	d**k@t**e	920
schalkdaniel	d**k@m**e	99
Janek Thomas	j**s@w**e	15
Maximilian Kaiser	3****t	11
Daniel Schalk	s**2@g**m	7
Debian	d**n@b**l	5
Shawn	s**s@y**m	3
runner	r**r@M**l	1
runner	r**r@M**l	1
runner	r**r@M**l	1
runner	r**r@M**l	1
runner	r**r@M**l	1
runner	r**r@M**l	1
runner	r**r@M**l	1
runner	r**r@M**l	1
Quay	q**u@g**m	1
Michel Lang	m**g@g**m	1

Committer Domains (Top 20 + Academic)

bigger-benchmarks.novalocal: 1, mail.de: 1, t-online.de: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 64
Total pull requests: 36
Average time to close issues: about 1 year
Average time to close pull requests: 12 days
Total issue authors: 4
Total pull request authors: 3
Average comments per issue: 0.66
Average comments per pull request: 0.28
Merged pull requests: 36
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

schalkdaniel (58)
SGolbert (3)
illusive-git (2)
VladPerervenko (1)

Pull Request Authors

schalkdaniel (29)
illusive-git (6)
QuayAu (1)

Top Labels

Issue Labels

Implementation (20), Enhancement (17), important (12), Optimizer (3), Bug (3), Documentation (3), Should not take too long (2), Base-Learner (2), Simplification (2), Performance (2), Loss (2), Naming (1), Data (1), Response (1), Logger (1)

Dependencies

DESCRIPTION cran

R >= 3.4.0 depends
R6 * imports
Rcpp >= 0.11.2 imports
checkmate * imports
glue * imports
methods * imports
RcppArmadillo >= 0.9.100.5.0 suggests
covr * suggests
ggplot2 * suggests
ggrepel * suggests
ggthemes * suggests
gridExtra * suggests
knitr * suggests
mboost * suggests
mlr * suggests
pkgdown * suggests
rmarkdown * suggests
rpart * suggests
testthat * suggests
titanic * suggests

.github/workflows/R-CMD-check.yaml actions

actions/checkout v2 composite
actions/upload-artifact main composite
r-lib/actions/check-r-package v2 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/pkgdown.yaml actions

actions/checkout v2 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/render-readme.yaml actions

actions/checkout v2 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite

.github/workflows/test-coverage.yaml actions

actions/checkout v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite
