stochtree

Stochastic tree ensembles (BART / XBART) for supervised learning and causal inference

https://github.com/stochastictree/stochtree

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.7%) to scientific vocabulary

Keywords

bart bayesian-machine-learning bayesian-methods decision-trees gradient-boosted-trees machine-learning probabilistic-models tree-ensembles
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: StochasticTree
  • License: other
  • Language: C++
  • Default Branch: main
  • Homepage: https://stochtree.ai/
  • Size: 39.6 MB
Statistics
  • Stars: 53
  • Watchers: 4
  • Forks: 14
  • Open Issues: 18
  • Releases: 0
Created over 2 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License

README.md

StochTree

Software for building stochastic tree ensembles (e.g. BART, XBART) for supervised learning and causal inference.

Getting Started

stochtree is composed of a C++ "core" and R / Python interfaces to that core. Details on installation and use are available below:

Python Package

The Python package is not yet on PyPI but can be installed from source using pip's git interface. To proceed, you will need a working installation of git and Python 3.8 or greater (available from several sources, one of the most straightforward being the Anaconda suite).
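Before installing, it can help to confirm that the interpreter meets the 3.8 minimum stated above. The following is a minimal sketch, not part of stochtree: the helper name version_ok is hypothetical, and it assumes the interpreter is on PATH as python3.

```shell
# Hypothetical helper: succeeds when a "major.minor[.patch]" version string is >= 3.8.
version_ok() {
  major="${1%%.*}"      # text before the first dot
  rest="${1#*.}"        # text after the first dot
  minor="${rest%%.*}"   # text before the next dot, if any
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 8 ]; }
}

# Only query the interpreter if one is actually on PATH (assumed name: python3).
if command -v python3 >/dev/null 2>&1; then
  found="$(python3 -c 'import platform; print(platform.python_version())')"
  if version_ok "$found"; then
    echo "Python $found is new enough"
  else
    echo "Python $found is older than 3.8"
  fi
fi
```

If your interpreter is installed under a different name (for example python), substitute that name in the two places it appears.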

Quick start

Without worrying about virtual environments (detailed further below), stochtree can be installed from the command line

pip install numpy scipy pytest pandas scikit-learn pybind11
pip install git+https://github.com/StochasticTree/stochtree.git
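After installation, a quick way to confirm that the package landed in the interpreter you expect is to try importing it. This is a sketch under assumptions: the helper name check_import is hypothetical, and the interpreter is assumed to be on PATH as python3.

```shell
# Hypothetical helper: prints "ok" if the named Python module imports cleanly,
# and "missing" otherwise (including when python3 itself is not found).
check_import() {
  if python3 -c "import $1" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Report on the module installed by the commands above.
check_import stochtree
```

A result of "missing" usually means pip installed into a different environment than the one python3 resolves to.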

Virtual environment installation

Often, users prefer to manage different projects (with different package / python version requirements) in virtual environments.

Conda

Conda provides a straightforward experience in managing python dependencies, avoiding version conflicts / ABI issues / etc.

To build stochtree using a conda based workflow, first create and activate a conda environment with the requisite dependencies

conda create -n stochtree-dev -c conda-forge python=3.10 numpy scipy pytest pandas pybind11 scikit-learn matplotlib seaborn
conda activate stochtree-dev

Then install the package from GitHub via pip

pip install git+https://github.com/StochasticTree/stochtree.git

(Note: to run stochtree's notebook examples, you will also need jupyterlab, seaborn, and matplotlib.)

conda install matplotlib seaborn
pip install jupyterlab

With these dependencies installed, you can clone the repo and run the demo/ examples.

Venv

You could also use venv for environment management. First, navigate to the folder in which you usually store virtual environments (e.g. cd /path/to/envs) and create and activate a virtual environment:

python -m venv venv
source venv/bin/activate
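Since the later pip commands install into whichever environment is active, it can be worth confirming that activation took effect. The sketch below relies on the VIRTUAL_ENV variable that venv's activate script exports; the helper name in_venv is hypothetical.

```shell
# Hypothetical helper: succeeds when a venv-style environment is active,
# judged by the VIRTUAL_ENV variable that the activate script sets.
in_venv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
  echo "active environment: $VIRTUAL_ENV"
else
  echo "no virtual environment active"
fi
```

Note that this checks only venv-style activation; conda environments are tracked through different variables.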

Install the package dependencies

pip install numpy scipy pytest pandas scikit-learn pybind11

Then install stochtree via

pip install git+https://github.com/StochasticTree/stochtree.git

As above, to run the notebook examples in the demo/ subfolder, you will need to clone the repo and install jupyterlab, seaborn, and matplotlib

pip install matplotlib seaborn jupyterlab

R Package

The R package can be installed from CRAN via

install.packages("stochtree")

The development version of stochtree can be installed from GitHub via

remotes::install_github("StochasticTree/stochtree", ref="r-dev")

C++ Core

While the C++ core links to both R and Python for a performant, high-level interface, the C++ code can also be compiled on its own, both into a unit test suite and into a standalone debug program.

Compilation

Cloning the Repository

To clone the repository, you must have git installed; if it is not already available, follow the official git installation instructions.

Once git is available at the command line, navigate to the folder that will store this project (in bash / zsh, this is done by running cd followed by the path to the directory). Then, clone the stochtree repo as a subfolder by running

git clone --recursive https://github.com/StochasticTree/stochtree.git

NOTE: this project incorporates several dependencies as git submodules, which is why the --recursive flag is necessary (some systems may perform a recursive clone without this flag, but --recursive ensures this behavior on all platforms). If you have already cloned the repo without the --recursive flag, you can retrieve the submodules recursively by running git submodule update --init --recursive in the main repo directory.
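To see whether a clone is missing its submodules, one option is to inspect the output of git submodule status, which prefixes each uninitialized submodule with a minus sign. The following is a sketch; the helper name count_uninitialized is hypothetical, and the guarded call only runs inside a git work tree.

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints how many submodules are uninitialized (lines starting with "-").
count_uninitialized() {
  grep -c '^-' || true   # grep exits non-zero when the count is 0; ignore that
}

# Run the check only when the current directory is inside a git work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  git submodule status --recursive | count_uninitialized
fi
```

A non-zero count suggests running git submodule update --init --recursive as described above.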

CMake Build

The C++ project can be built independently from the R / Python packages using cmake. See the CMake website for details on installing cmake (alternatively, on macOS, cmake can be installed using homebrew). Once cmake is installed, you can build the CLI by navigating to the main project directory at your command line (e.g. cd /path/to/stochtree) and running the following code

rm -rf build
mkdir build
cmake -S . -B build
cmake --build build

The CMake build has two primary targets, which are detailed below

Debug Program

debug/api_debug.cpp defines a standalone target that can be straightforwardly run with a debugger (e.g. lldb, gdb) while making non-trivial changes to the C++ code. This debugging program is compiled as part of the CMake build if the BUILD_DEBUG_TARGETS option in CMakeLists.txt is set to ON.

Once the program has been built, it can be run from the command line via ./build/debugstochtree or attached to a debugger via lldb ./build/debugstochtree (clang) or gdb ./build/debugstochtree (gcc).

Unit Tests

We test stochtree using the GoogleTest framework. Unit tests are compiled into a single target as part of the CMake build if the BUILD_TEST option is set to ON. After compilation, the test suite can be run via ./build/teststochtree
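Rather than editing CMakeLists.txt, CMake options can generally be overridden at configure time with -D flags. The sketch below composes such a configure command for the two options named above (BUILD_TEST and BUILD_DEBUG_TARGETS); the helper name configure_cmd is hypothetical, and it only prints the command so you can review it before running.

```shell
# Hypothetical helper: prints a cmake configure command with the two
# README-documented options set to the given values (default ON for both).
configure_cmd() {
  tests="${1:-ON}"   # value for BUILD_TEST
  debug="${2:-ON}"   # value for BUILD_DEBUG_TARGETS
  printf 'cmake -S . -B build -DBUILD_TEST=%s -DBUILD_DEBUG_TARGETS=%s\n' "$tests" "$debug"
}

# Print the command that enables both the test and debug targets.
configure_cmd ON ON
```

Run the printed command from the main project directory, then build as before with cmake --build build.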

Xcode

While using gdb or lldb on debugstochtree at the command line is very helpful, users may prefer debugging in a full-fledged IDE like Xcode. This project's C++ core can be converted to an Xcode project from CMakeLists.txt, but first you must turn off sanitizers (Xcode seems to have its own way of setting this at build time for different configurations, and injecting -fsanitize=address statically into the compiler arguments will cause Xcode errors). To do this, modify the USE_SANITIZER line in CMakeLists.txt:

option(USE_SANITIZER "Use sanitizer flags" OFF)

To generate an Xcode project based on the build targets and specifications defined in CMakeLists.txt, navigate to the main project folder (e.g. cd /path/to/project) and run the following commands:

rm -rf xcode/
mkdir xcode
cd xcode
cmake -G Xcode .. -DCMAKE_C_COMPILER=cc -DCMAKE_CXX_COMPILER=c++ -DUSE_SANITIZER=OFF -DUSE_DEBUG=OFF
cd ..

Now, if you navigate to the xcode subfolder (in Finder), you should be able to click on a .xcodeproj file and the project will open in Xcode.

Owner

  • Name: StochasticTree
  • Login: StochasticTree
  • Kind: organization

GitHub Events

Total
  • Issues event: 24
  • Watch event: 36
  • Delete event: 30
  • Issue comment event: 18
  • Push event: 240
  • Pull request event: 131
  • Fork event: 5
  • Create event: 61
Last Year
  • Issues event: 24
  • Watch event: 36
  • Delete event: 30
  • Issue comment event: 18
  • Push event: 240
  • Pull request event: 131
  • Fork event: 5
  • Create event: 61

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 27
  • Total pull requests: 94
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 5 days
  • Total issue authors: 11
  • Total pull request authors: 4
  • Average comments per issue: 0.56
  • Average comments per pull request: 0.05
  • Merged pull requests: 75
  • Bot issues: 0
  • Bot pull requests: 4
Past Year
  • Issues: 21
  • Pull requests: 78
  • Average time to close issues: 12 days
  • Average time to close pull requests: 1 day
  • Issue authors: 9
  • Pull request authors: 4
  • Average comments per issue: 0.62
  • Average comments per pull request: 0.04
  • Merged pull requests: 61
  • Bot issues: 0
  • Bot pull requests: 4
Top Authors
Issue Authors
  • andrewherren (16)
  • jaredsmurray (2)
  • jdtuck (1)
  • arainboldt (1)
  • marcoBmota8 (1)
  • debajyotid (1)
  • Gattocrucco (1)
  • drizmiz (1)
  • awooddoughty (1)
  • kevinli1324 (1)
  • d-vct (1)
Pull Request Authors
  • andrewherren (88)
  • github-actions[bot] (4)
  • limapsd (1)
  • MichaelChirico (1)
Top Labels
Issue Labels
enhancement (6) bug (3) maintenance (1)
Pull Request Labels
enhancement (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 219 last-month
    • cran 474 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 4
  • Total maintainers: 3
pypi.org: stochtree

Stochastic Tree Ensembles for Machine Learning and Causal Inference

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 219 Last month
Rankings
Dependent packages count: 10.7%
Average: 35.4%
Dependent repos count: 60.0%
Maintainers (2)
Last synced: 6 months ago
cran.r-project.org: stochtree

Stochastic Tree Ensembles (XBART and BART) for Supervised Learning and Causal Inference

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 474 Last month
Rankings
Dependent packages count: 27.2%
Dependent repos count: 33.5%
Average: 49.2%
Downloads: 86.8%
Last synced: 6 months ago

Dependencies

R-package/DESCRIPTION cran