additive-sparse-boost-regression

A Python Package for a Sparse Additive Boosting Regressor

https://github.com/thesis-jdgs/additive-sparse-boost-regression

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.8%) to scientific vocabulary

Keywords

additive-models data-science explainable-ai feature-selection gradient-boosted-trees python3 sparse-regression
Last synced: 6 months ago

Repository

A Python Package for a Sparse Additive Boosting Regressor

Basic Info
  • Host: GitHub
  • Owner: thesis-jdgs
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 2.24 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
additive-models data-science explainable-ai feature-selection gradient-boosted-trees python3 sparse-regression
Created almost 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme · License · Citation

README.md

AdditiveSparseBoostRegressor

This repository holds the implementation of an additive regressor, i.e.:

```math
y \approx f(x) = \beta + \sum_{i=1}^{M} f_i(x_i)
```

where the $f_i$ are zero-mean functions and $\beta$ is the intercept; both are estimated by minimizing the squared-error loss.

Each function $f_i$ is either a piecewise-constant function (trained as an ensemble of decision trees) or the trivial zero function; that is, some features are ignored, making the regressor sparse.
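
As a small illustration of the first case (not taken from the package itself), a shallow regression tree fit on a single feature is exactly such a piecewise-constant function:

```python
# Illustrative only: a 1-D regression tree is a step (piecewise-constant) function.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

x = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
y = np.sin(x).ravel()

tree = DecisionTreeRegressor(max_depth=3).fit(x, y)
f_i = tree.predict(x)  # constant on each leaf's interval

print(len(np.unique(f_i)), "distinct levels")  # at most 2**max_depth = 8
```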

This is achieved by combining the boosting algorithm with a modification of mRMR (minimum Redundancy Maximum Relevance) feature selection at each boosting iteration.
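
A rough sketch of this idea (greedy additive boosting where each round selects one feature and fits a one-feature tree to the residual) might look like the following. This is not the package's actual algorithm: the score below uses only a simple relevance term (absolute correlation with the residual) and omits the redundancy penalty that mRMR adds, and all names are illustrative.

```python
# Hypothetical sketch, not the package's implementation: additive boosting where
# each round picks a single feature by a relevance score and fits a 1-D tree to
# the current residual. A real mRMR-style variant would also penalize redundancy
# with already-selected features.
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_sparse_additive_boost(X, y, n_estimators=100, learning_rate=0.1, max_depth=2):
    intercept = y.mean()               # the beta term of the model
    residual = y - intercept
    stages = []                        # list of (feature_index, fitted_tree)
    for _ in range(n_estimators):
        # Relevance of each feature to the current residual: |Pearson correlation|.
        scores = [abs(np.corrcoef(X[:, j], residual)[0, 1]) for j in range(X.shape[1])]
        j = int(np.nanargmax(scores))  # most relevant feature this round
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X[:, [j]], residual)
        residual = residual - learning_rate * tree.predict(X[:, [j]])
        stages.append((j, tree))
    return intercept, stages


def predict_sparse_additive_boost(X, intercept, stages, learning_rate=0.1):
    # Use the same learning_rate that was used during fitting.
    pred = np.full(X.shape[0], intercept, dtype=float)
    for j, tree in stages:
        pred += learning_rate * tree.predict(X[:, [j]])
    return pred
```

Features that are never selected contribute the zero function, which is what makes the fitted model sparse.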

Installation

The package can be installed with pip:

```bash
pip install git+https://github.com/thesis-jdgs/additive-sparse-boost-regression.git
```

Note: The package is not yet available on PyPI.

Usage

The regressor is implemented in asboostreg.py and exposes the familiar fit and predict methods from scikit-learn:

```python
from asboostreg import SparseAdditiveBoostingRegressor
from sklearn.datasets import load_boston

X, y = load_boston(return_X_y=True)
sparse_reg = SparseAdditiveBoostingRegressor(
    learning_rate=0.01,
    n_estimators=10000,
    l2_regularization=2.0,
    max_depth=6,
    row_subsample=0.632,
    random_state=0,
    n_iter_no_change=30,
)
sparse_reg.fit(X, y)
y_pred = sparse_reg.predict(X)
```

To inspect the general characteristics of the model, the plot_model_information method creates a plotly figure:

```python
sparse_reg.plot_model_information()
```

which creates a figure of the iteration history and of the model complexity for each feature, like the following:

*Figures: iteration history and model complexity.*

To inspect the predictions for a dataset X, you can use explain(X):

```python
sparse_reg.explain(X)
```

which creates a figure of the mean importance of each feature, and a plot of the 1D regressor for each selected feature:

*Figures: mean importances and an example 1D component plot.*

We can also decompose the predictions into their additive components with the contribution_frame method:

```python
sparse_reg.contribution_frame(X)
```

which returns a pandas DataFrame with the additive component for each feature.
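
As a hedged usage sketch (the exact column layout and whether the intercept appears as a column are assumptions, not documented here), the row-wise sum of the contributions should roughly recover the predictions:

```python
# Assumption: one column per selected feature; intercept handling may differ.
contributions = sparse_reg.contribution_frame(X)
print(contributions.head())

# If each column is one feature's additive component, their row-wise sum
# (plus the intercept, if it is not already a column) should approximate y_pred.
approx = contributions.sum(axis=1)
```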

Owner

  • Name: thesis-jdgs
  • Login: thesis-jdgs
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Godoy Sánchez"
  given-names: "Johnny Godoy"
  orcid: "https://orcid.org/0009-0000-1202-1106"
title: "additive-sparse-boost-regression"
version: 1.0.0
date-released: 2023-05-21
url: "https://github.com/thesis-jdgs/additive-sparse-boost-regression"


Dependencies

requirements.txt pypi
  • attrs *
  • category_encoders *
  • numpy *
  • pandas *
  • plotly *
  • scikit-image *
  • scikit-learn *
requirements_notebooks.txt pypi
  • interpret *
  • jupyter *
  • xgboost *