GBNet

GBNet: Gradient Boosting packages integrated into PyTorch - Published in JOSS (2025)

https://github.com/mthorrell/gbnet

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Earth and Environmental Sciences (Physical Sciences) - 40% confidence
Economics (Social Sciences) - 40% confidence
Last synced: 4 months ago

Repository

Gradient Boosting Modules for PyTorch

Basic Info
  • Host: GitHub
  • Owner: mthorrell
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 1.42 MB
Statistics
  • Stars: 44
  • Watchers: 2
  • Forks: 1
  • Open Issues: 8
  • Releases: 18
Created over 1 year ago · Last pushed 5 months ago
Metadata Files
Readme · License

README.md

GBNet


PyTorch Modules for XGBoost and LightGBM

Table of Contents

  1. Introduction
  2. Install and Docs
  3. PyTorch Modules
  4. Models
  5. Contributing
  6. Cite this work

Introduction

XGBoost and LightGBM are industry-standard gradient boosting packages used to solve tabular data machine learning problems. Users of these packages who wish to define custom loss functions, novel architectures, or other advanced models, however, can face substantial difficulty: both XGBoost and LightGBM require potentially complex gradient and Hessian calculations. GBNet provides PyTorch Modules wrapping XGBoost and LightGBM so that users can construct and fit nearly arbitrary model architectures involving XGBoost or LightGBM without supplying gradient and Hessian calculations themselves. PyTorch's autograd system calculates derivative information automatically; GBNet orchestrates delivery of that information back to the boosting algorithms. By linking XGBoost and LightGBM to PyTorch, GBNet expands the set of applications for gradient boosting models.

There are two main components of gbnet:

  • (1) gbnet.xgbmodule, gbnet.lgbmodule and gbnet.gblinear provide the PyTorch Modules that allow fitting of XGBoost, LightGBM and Boosted Linear models using PyTorch's computation graph and automatic differentiation capabilities.

    • For example, if $F(X)$ is the output of an XGBoost model, you can use PyTorch to define the loss function $L(y, F(X))$. PyTorch handles the gradients of $L$, so, as a user, you only specify the loss function.
    • You can also fit two (or more) boosted models together with PyTorch-supported parametric components. For instance, a recommendation prediction might look like $\sigma(F(user) \times G(item))$, where $F$ and $G$ are separate boosting models producing embeddings of users and items respectively. gbnet makes defining and fitting such a model almost as easy as using PyTorch itself (see the sketch after this list).
  • (2) gbnet.models provides specific example estimators that accomplish things that were not previously possible using only XGBoost or LightGBM. Current models:

    • Forecast is a forecasting model similar in execution to Meta's Prophet algorithm. In the settings we tested, gbnet.models.forecasting.Forecast beats the performance of Prophet (see the forecasting PR for a comparison).
    • GBOrd is ordinal regression using GBMs (both XGBoost and LightGBM are supported). The complex loss function (with fittable parameters) is specified in PyTorch and layered on top of either XGBModule or LGBModule.
    • Other models planned for integration include time-varying survival analysis and NLP-related models.
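
As a hedged sketch of the recommendation example above: the feature arrays, hyperparameters, and loss below are illustrative, not from the gbnet docs; the XGBModule constructor and gb_step pattern follow the examples later in this README.

```python
import numpy as np
import torch

from gbnet import xgbmodule

# Illustrative data: user/item feature rows for n observed interactions
np.random.seed(0)
n, user_dim, item_dim, embed_dim = 1000, 8, 6, 4
user_X = np.random.random([n, user_dim])
item_X = np.random.random([n, item_dim])
clicks = torch.randint(0, 2, (n, 1)).float()

# F embeds users, G embeds items; each is a separate boosting model
F = xgbmodule.XGBModule(n, user_dim, embed_dim, params={})
G = xgbmodule.XGBModule(n, item_dim, embed_dim, params={})
bce = torch.nn.BCELoss()

for _ in range(50):
    F.zero_grad()
    G.zero_grad()
    # sigma(F(user) x G(item)): inner product of embeddings, then sigmoid
    score = torch.sigmoid((F(user_X) * G(item_X)).sum(dim=1, keepdim=True))
    loss = bce(score, clicks)
    loss.backward(create_graph=True)  # create_graph=True required by gbnet
    F.gb_step()
    G.gb_step()
```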

Install and Docs

```
pip install gbnet
```

Troubleshooting and dependencies

Use of virtual environments/conda is best practice when installing GBNet. GBNet requires XGBoost, LightGBM and PyTorch as key dependencies and may use these packages simultaneously. Each of these packages relies on an OpenMP implementation for parallelization. Conflicts between OpenMP implementations will throw warnings and may produce slow or incorrect outputs. Prior to installing these Python dependencies, it is best to ensure that each of them points to a single OpenMP implementation. Apple Silicon users may prefer to install libomp via brew before installing the Python package dependencies (see, for example, the build notes for XGBoost for additional details).

Docs

https://gbnet.readthedocs.io/

PyTorch Modules

There are currently three PyTorch Modules in gbnet: lgbmodule.LGBModule, xgbmodule.XGBModule and gblinear.GBLinear. These create the interface between PyTorch and the boosting algorithms. LightGBM and XGBoost are wrapped in LGBModule and XGBModule respectively. GBLinear is a linear layer that is trained with boosting (rather than gradient descent) -- for some applications it trains much faster than gradient descent (see this PR for details).

Conceptually, how can PyTorch be used to fit XGBoost or LightGBM models?

Gradient Boosting Machines require only gradients and, for modern packages, Hessians to train. PyTorch (like other neural network packages) calculates gradients and Hessians automatically. GBMs can therefore be fit as the first layer in a neural network using PyTorch.
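
To make this concrete, here is a minimal sketch (plain PyTorch, not gbnet internals) showing how autograd can produce the per-observation gradient and diagonal-Hessian arrays that a custom XGBoost or LightGBM objective consumes; the data here is illustrative.

```python
import torch

# Current boosting predictions (raw scores) and binary targets
preds = torch.zeros(3, requires_grad=True)
y = torch.tensor([1.0, 0.0, 1.0])

# Any differentiable loss written in torch works here
loss = torch.nn.functional.binary_cross_entropy_with_logits(preds, y, reduction='sum')

# First derivatives w.r.t. the predictions;
# create_graph=True lets us differentiate a second time
grad, = torch.autograd.grad(loss, preds, create_graph=True)
# Second derivatives (diagonal of the Hessian)
hess, = torch.autograd.grad(grad.sum(), preds)

print(grad)  # sigmoid(preds) - y
print(hess)  # sigmoid(preds) * (1 - sigmoid(preds))
```

This is the bookkeeping GBNet automates: after each `loss.backward(create_graph=True)`, the modules hand these arrays back to the underlying boosters.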

CatBoost is also supported, but in an experimental capacity, since the current gbnet integration with CatBoost is not as performant as those for the other GBDT packages.

Is training a gbnet model closer to training a neural network or to training a GBM?

It's closer to training a GBM. Currently, the biggest difference between training with gbnet and training with plain torch is that gbnet, like basic usage of xgboost and lightgbm, requires the entire dataset to be fed in at once. Cached predictions allow these packages to train quickly, and caching cannot happen if input batches change with each training/boosting round. There are ways around this, but there is currently no native functionality in gbnet for true batch training. Additional info is provided in #12.

Basic training of a GBM for comparison to existing gradient boosting packages

```python
import time

import lightgbm as lgb
import numpy as np
import xgboost as xgb
import torch

from gbnet import lgbmodule, xgbmodule

# Generate Dataset
np.random.seed(100)
n = 1000
input_dim = 20
output_dim = 1
X = np.random.random([n, input_dim])
B = np.random.random([input_dim, output_dim])
Y = X.dot(B) + np.random.random([n, output_dim])

iters = 100
t0 = time.time()

# XGBoost training for comparison
xbst = xgb.train(
    params={'objective': 'reg:squarederror', 'base_score': 0.0},
    dtrain=xgb.DMatrix(X, label=Y),
    num_boost_round=iters
)
t1 = time.time()

# LightGBM training for comparison
lbst = lgb.train(
    params={'verbose': -1},
    train_set=lgb.Dataset(X, label=Y.flatten(), init_score=[0 for i in range(n)]),
    num_boost_round=iters
)
t2 = time.time()

# XGBModule training
xnet = xgbmodule.XGBModule(n, input_dim, output_dim, params={})
xmse = torch.nn.MSELoss()

X_dmatrix = xgb.DMatrix(X)
for i in range(iters):
    xnet.zero_grad()
    xpred = xnet(X_dmatrix)

    loss = 1/2 * xmse(xpred, torch.Tensor(Y))  # xgboost uses 1/2 (Y - P)^2
    loss.backward(create_graph=True)

    xnet.gb_step()

xnet.eval()  # like any torch module, use eval mode for predictions
t3 = time.time()

# LGBModule training
lnet = lgbmodule.LGBModule(n, input_dim, output_dim, params={})
lmse = torch.nn.MSELoss()

X_dataset = lgb.Dataset(X)
for i in range(iters):
    lnet.zero_grad()
    lpred = lnet(X_dataset)

    loss = lmse(lpred, torch.Tensor(Y))
    loss.backward(create_graph=True)

    lnet.gb_step()

lnet.eval()  # use eval mode for predictions
t4 = time.time()

print(np.max(np.abs(xbst.predict(xgb.DMatrix(X)) - xnet(X_dmatrix).detach().numpy().flatten())))  # 9.537e-07
print(np.max(np.abs(lbst.predict(X) - lnet(X).detach().numpy().flatten())))  # 2.479e-07
print(f'xgboost time: {t1 - t0}')    # 0.089
print(f'lightgbm time: {t2 - t1}')   # 0.084
print(f'xgbmodule time: {t3 - t2}')  # 0.166
print(f'lgbmodule time: {t4 - t3}')  # 0.123
```

Training XGBoost and LightGBM together

```python
import time

import numpy as np
import torch

from gbnet import lgbmodule, xgbmodule


# Create new module that jointly trains multi-output xgboost and lightgbm models;
# the outputs of these gbm models are then combined by a linear layer
class GBPlus(torch.nn.Module):
    def __init__(self, input_dim, intermediate_dim, output_dim):
        super(GBPlus, self).__init__()

        self.xgb = xgbmodule.XGBModule(n, input_dim, intermediate_dim, {'eta': 0.1})
        self.lgb = lgbmodule.LGBModule(n, input_dim, intermediate_dim, {'eta': 0.1})
        self.linear = torch.nn.Linear(intermediate_dim, output_dim)

    def forward(self, input_array):
        xpreds = self.xgb(input_array)
        lpreds = self.lgb(input_array)
        preds = self.linear(xpreds + lpreds)
        return preds

    def gb_step(self):
        self.xgb.gb_step()
        self.lgb.gb_step()


# Generate Dataset
np.random.seed(100)
n = 1000
input_dim = 10
output_dim = 1
X = np.random.random([n, input_dim])
B = np.random.random([input_dim, output_dim])
Y = X.dot(B) + np.random.random([n, output_dim])

intermediate_dim = 10
gbp = GBPlus(input_dim, intermediate_dim, output_dim)
mse = torch.nn.MSELoss()
optimizer = torch.optim.Adam(gbp.parameters(), lr=0.005)

t0 = time.time()
losses = []
for i in range(100):
    optimizer.zero_grad()
    preds = gbp(X)

    loss = mse(preds, torch.Tensor(Y))
    loss.backward(create_graph=True)  # create_graph=True required for any gbnet
    losses.append(loss.detach().numpy().copy())

    gbp.gb_step()  # required to update the gbms
    optimizer.step()

t1 = time.time()
print(t1 - t0)  # 5.821
```


Models

Forecasting

gbnet.models.forecasting.Forecast outperforms Meta's popular Prophet algorithm on basic benchmarks (see the forecasting PR for a comparison). Starter comparison code:

```python
import pandas as pd
from prophet import Prophet
from sklearn.metrics import root_mean_squared_error

from gbnet.models import forecasting

# Load and split data
url = "https://raw.githubusercontent.com/facebook/prophet/main/examples/example_yosemite_temps.csv"
df = pd.read_csv(url)
df['ds'] = pd.to_datetime(df['ds'])

train = df[df['ds'] < df['ds'].median()].reset_index(drop=True).copy()
test = df[df['ds'] >= df['ds'].median()].reset_index(drop=True).copy()

# train and predict comparing out-of-the-box gbnet & prophet
# gbnet
gbnet_forecast_model = forecasting.Forecast()
gbnet_forecast_model.fit(train, train['y'])
test['gbnet_pred'] = gbnet_forecast_model.predict(test)['yhat']

# prophet
prophet_model = Prophet()
prophet_model.fit(train)
test['prophet_pred'] = prophet_model.predict(test)['yhat']

sel = test['y'].notnull()
print(f"gbnet rmse: {root_mean_squared_error(test[sel]['y'], test[sel]['gbnet_pred'])}")
print(f"prophet rmse: {root_mean_squared_error(test[sel]['y'], test[sel]['prophet_pred'])}")
# gbnet rmse: 8.757314439339462
# prophet rmse: 20.10509806878121
```

Ordinal Regression

See this notebook for examples.

```python
from gbnet.models import ordinal_regression

sklearn_estimator = ordinal_regression.GBOrd(num_classes=10)
```
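
GBOrd follows the scikit-learn estimator interface, so fitting and predicting should look roughly like the sketch below (the synthetic data and the exact fit/predict signatures are assumptions here, not taken from the gbnet docs):

```python
import numpy as np

from gbnet.models import ordinal_regression

# Illustrative synthetic ordinal data
X = np.random.random([200, 5])
y = np.random.randint(0, 10, size=200)  # ordinal labels in 0..9

sklearn_estimator = ordinal_regression.GBOrd(num_classes=10)
sklearn_estimator.fit(X, y)           # assumed sklearn-style fit
preds = sklearn_estimator.predict(X)  # assumed sklearn-style predict
```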

Contributing

Contributions are welcome! Here are some ways you can help:

  • Report bugs and request features by opening issues
  • Submit pull requests with bug fixes or new features
  • Improve documentation and examples
  • Add tests to increase code coverage

Before submitting a pull request:

  1. Fork the repository and create a new branch
  2. Add tests for any new functionality
  3. Ensure all tests pass by running pytest
  4. Update documentation as needed
  5. Follow the existing code style

For major changes, please open an issue first to discuss what you would like to change.

Cite this work

Horrell, M. (2025). GBNet: Gradient Boosting packages integrated into PyTorch. Journal of Open Source Software, 10(111), 8047. https://doi.org/10.21105/joss.08047

Owner

  • Login: mthorrell
  • Kind: user

JOSS Publication

GBNet: Gradient Boosting packages integrated into PyTorch
Published
July 07, 2025
Volume 10, Issue 111, Page 8047
Authors
Michael Horrell ORCID
Independent Researcher, USA
Editor
Øystein Sørensen ORCID
Tags
PyTorch Gradient Boosting XGBoost LightGBM

GitHub Events

Total
  • Create event: 55
  • Issues event: 25
  • Release event: 11
  • Watch event: 26
  • Delete event: 41
  • Issue comment event: 49
  • Push event: 125
  • Pull request event: 84
  • Fork event: 1
Last Year
  • Create event: 55
  • Issues event: 25
  • Release event: 11
  • Watch event: 26
  • Delete event: 41
  • Issue comment event: 49
  • Push event: 125
  • Pull request event: 84
  • Fork event: 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 11
  • Total pull requests: 31
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 3 days
  • Total issue authors: 3
  • Total pull request authors: 2
  • Average comments per issue: 1.73
  • Average comments per pull request: 0.32
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 10
  • Pull requests: 31
  • Average time to close issues: 23 days
  • Average time to close pull requests: 3 days
  • Issue authors: 3
  • Pull request authors: 2
  • Average comments per issue: 1.2
  • Average comments per pull request: 0.32
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mthorrell (9)
  • bahung (2)
  • cswiercz (1)
  • DeblueJenkins (1)
  • cswiercz-sentilink (1)
Pull Request Authors
  • mthorrell (44)
  • osorensen (2)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi: 155 last month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 13
  • Total maintainers: 1
pypi.org: gbnet

Gradient boosting libraries integrated with PyTorch

  • Versions: 13
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 155 last month
Rankings
Dependent packages count: 10.2%
Average: 33.8%
Dependent repos count: 57.4%
Maintainers (1)
Last synced: 4 months ago