stochtree
Stochastic tree ensembles (BART / XBART) for supervised learning and causal inference
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (17.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: StochasticTree
- License: other
- Language: C++
- Default Branch: main
- Homepage: https://stochtree.ai/
- Size: 39.6 MB
Statistics
- Stars: 53
- Watchers: 4
- Forks: 14
- Open Issues: 18
- Releases: 0
Topics
Metadata Files
README.md
StochTree
Software for building stochastic tree ensembles (i.e. BART, XBART) for supervised learning and causal inference.
Getting Started
stochtree is composed of a C++ "core" and R / Python interfaces to that core.
Details on installation and use are available below:
Python Package
The Python package is not yet on PyPI but can be installed from source using pip's git interface. To proceed, you will need a working installation of git and Python 3.8 or greater (available from several sources, one of the most straightforward being the Anaconda distribution).
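Before installing, it can help to confirm both prerequisites. The following is a minimal sketch, not part of stochtree: the helper name `check_prerequisites` is hypothetical, and the 3.8 floor is taken from the requirement stated above.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the installation notes above


def check_prerequisites(python_version, git_path):
    """Return a list of human-readable problems (empty if everything is fine).

    Hypothetical helper for illustration; it is not part of stochtree.
    """
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python {found} found, but 3.8 or greater is required")
    if git_path is None:
        problems.append("git was not found on PATH")
    return problems


if __name__ == "__main__":
    # Inspect the interpreter running this script and look up git on PATH.
    issues = check_prerequisites(sys.version_info, shutil.which("git"))
    print("\n".join(issues) if issues else "Prerequisites look fine")
```

Run the script with the same interpreter you intend to use for the installation, since the version check applies to whichever Python executes it.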
Quick start
Without worrying about virtual environments (detailed further below), stochtree can be installed from the command line
pip install numpy scipy pytest pandas scikit-learn pybind11
pip install git+https://github.com/StochasticTree/stochtree.git
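If the second command fails while building, one thing worth ruling out is a dependency that did not install. The sketch below checks whether each package from the first command can be located for import; `missing_modules` is a hypothetical helper, and the only non-obvious mapping is that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Import names for the packages in the pip command above.
# Note that scikit-learn is imported as "sklearn".
DEPENDENCY_MODULES = ["numpy", "scipy", "pytest", "pandas", "sklearn", "pybind11"]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(DEPENDENCY_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable")
```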
Virtual environment installation
Often, users prefer to manage different projects (with different package / python version requirements) in virtual environments.
Conda
Conda provides a straightforward experience in managing python dependencies, avoiding version conflicts / ABI issues / etc.
To build stochtree using a conda based workflow, first create and activate a conda environment with the requisite dependencies
{bash}
conda create -n stochtree-dev -c conda-forge python=3.10 numpy scipy pytest pandas pybind11 scikit-learn matplotlib seaborn
conda activate stochtree-dev
Then install the package from github via pip
{bash}
pip install git+https://github.com/StochasticTree/stochtree.git
(Note: if you'd also like to run stochtree's notebook examples, you will also need jupyterlab, seaborn, and matplotlib)
{bash}
conda install matplotlib seaborn
pip install jupyterlab
With these dependencies installed, you can clone the repo and run the demo/ examples.
Venv
You could also use venv for environment management. First, navigate to the folder in which you usually store virtual environments
(e.g. cd /path/to/envs), then create and activate a virtual environment:
{bash}
python -m venv venv
source venv/bin/activate
Install the package dependencies
{bash}
pip install numpy scipy pytest pandas scikit-learn pybind11
Then install stochtree via
{bash}
pip install git+https://github.com/StochasticTree/stochtree.git
As above, if you'd like to run the notebook examples in the demo/ subfolder, you will also need jupyterlab, seaborn, and matplotlib, and you will have to clone the repo
{bash}
pip install matplotlib seaborn jupyterlab
R Package
The R package can be installed from CRAN via
install.packages("stochtree")
The development version of stochtree can be installed from GitHub via
remotes::install_github("StochasticTree/stochtree", ref="r-dev")
C++ Core
While the C++ core links to both R and Python for a performant, high-level interface, the C++ code can also be compiled on its own, unit-tested, and built into a standalone debug program.
Compilation
Cloning the Repository
To clone the repository, you must have git installed, which you can do following these instructions.
Once git is available at the command line, navigate to the folder that will store this project (in bash / zsh, this is done by running cd followed by the path to the directory).
Then, clone the stochtree repo as a subfolder by running
{bash}
git clone --recursive https://github.com/StochasticTree/stochtree.git
NOTE: this project incorporates several dependencies as git submodules,
which is why the --recursive flag is necessary (some systems may perform a recursive clone without this flag, but
--recursive ensures this behavior on all platforms). If you have already cloned the repo without the --recursive flag,
you can retrieve the submodules recursively by running git submodule update --init --recursive in the main repo directory.
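To tell whether a clone is missing its submodules, you can inspect the output of `git submodule status`, where a leading `-` marks a submodule that has not been initialized. The sketch below parses that output; the function names are hypothetical and the paths in the sample are placeholders, not stochtree's actual submodule paths.

```python
import subprocess


def read_submodule_status(repo_dir="."):
    """Run `git submodule status --recursive` in repo_dir and return its output."""
    result = subprocess.run(
        ["git", "submodule", "status", "--recursive"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


def uninitialized_submodules(status_output):
    """Return paths of submodules that the status output marks with '-'."""
    paths = []
    for line in status_output.splitlines():
        if not line.strip():
            continue
        # The first character is the state flag: ' ' initialized,
        # '-' not initialized, '+' checked-out commit differs, 'U' conflicts.
        if line[0] == "-":
            fields = line[1:].split()
            paths.append(fields[1])  # fields are: commit hash, path
    return paths


if __name__ == "__main__":
    # Placeholder sample; the hashes and paths are made up for illustration.
    sample = "-aaaa1111 deps/one\n bbbb2222 deps/two (v1.0)\n"
    print(uninitialized_submodules(sample))
```

Inside a real clone, `uninitialized_submodules(read_submodule_status("."))` returning a non-empty list suggests that `git submodule update --init --recursive` still needs to be run.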
CMake Build
The C++ project can be built independently from the R / Python packages using cmake.
See the official CMake documentation for details on installing cmake (alternatively,
on macOS, cmake can be installed using Homebrew).
Once cmake is installed, you can build the CLI by navigating to the main
project directory at your command line (e.g. cd /path/to/stochtree) and
running the following code
{bash}
rm -rf build
mkdir build
cmake -S . -B build
cmake --build build
The CMake build has two primary targets, which are detailed below
Debug Program
debug/api_debug.cpp defines a standalone target that can be run under a debugger (e.g. lldb, gdb)
while making non-trivial changes to the C++ code.
This debugging program is compiled as part of the CMake build if the BUILD_DEBUG_TARGETS option in CMakeLists.txt is set to ON.
Once the program has been built, it can be run from the command line via ./build/debugstochtree or attached to a debugger
via lldb ./build/debugstochtree (clang) or gdb ./build/debugstochtree (gcc).
Unit Tests
We test stochtree using the GoogleTest framework.
Unit tests are compiled into a single target as part of the CMake build if the BUILD_TEST option is set to ON,
and the test suite can be run after compilation via ./build/teststochtree.
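Rather than editing CMakeLists.txt, CMake options can usually be set at configure time with `-D` flags. The sketch below assembles the configure and build commands with both targets toggled; it assumes BUILD_TEST and BUILD_DEBUG_TARGETS are declared as ordinary CMake cache options, and `cmake_commands` is a hypothetical helper rather than anything shipped with stochtree.

```python
import shlex


def cmake_commands(build_dir="build", build_test=True, build_debug_targets=True):
    """Return the configure and build commands as argument lists."""
    def on_off(flag):
        return "ON" if flag else "OFF"

    configure = [
        "cmake", "-S", ".", "-B", build_dir,
        f"-DBUILD_TEST={on_off(build_test)}",
        f"-DBUILD_DEBUG_TARGETS={on_off(build_debug_targets)}",
    ]
    build = ["cmake", "--build", build_dir]
    return [configure, build]


if __name__ == "__main__":
    # Print the commands; to execute them, pass each list to
    # subprocess.run(..., check=True) from the repository root.
    for command in cmake_commands():
        print(shlex.join(command))
```

Returning argument lists rather than a single string avoids shell-quoting issues if the build directory path contains spaces.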
Xcode
While using gdb or lldb on debugstochtree at the command line is very helpful, users may prefer debugging in a full-fledged IDE like Xcode. This project's C++ core can be converted to an Xcode project from CMakeLists.txt, but you must first turn off sanitizers: Xcode appears to set these itself at build time for each configuration, and injecting -fsanitize=address statically into the compiler arguments will cause Xcode errors.
To do this, modify the USE_SANITIZER line in CMakeLists.txt:
option(USE_SANITIZER "Use santizer flags" OFF)
To generate an Xcode project based on the build targets and specifications defined in CMakeLists.txt, navigate to the main project folder (e.g. cd /path/to/project) and run the following commands:
{bash}
rm -rf xcode/
mkdir xcode
cd xcode
cmake -G Xcode .. -DCMAKE_C_COMPILER=cc -DCMAKE_CXX_COMPILER=c++ -DUSE_SANITIZER=OFF -DUSE_DEBUG=OFF
cd ..
Now, if you navigate to the xcode subfolder (in Finder), you should be able to click on a .xcodeproj file and the project will open in Xcode.
Owner
- Name: StochasticTree
- Login: StochasticTree
- Kind: organization
- Repositories: 2
- Profile: https://github.com/StochasticTree
GitHub Events
Total
- Issues event: 24
- Watch event: 36
- Delete event: 30
- Issue comment event: 18
- Push event: 240
- Pull request event: 131
- Fork event: 5
- Create event: 61
Last Year
- Issues event: 24
- Watch event: 36
- Delete event: 30
- Issue comment event: 18
- Push event: 240
- Pull request event: 131
- Fork event: 5
- Create event: 61
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 27
- Total pull requests: 94
- Average time to close issues: about 1 month
- Average time to close pull requests: 5 days
- Total issue authors: 11
- Total pull request authors: 4
- Average comments per issue: 0.56
- Average comments per pull request: 0.05
- Merged pull requests: 75
- Bot issues: 0
- Bot pull requests: 4
Past Year
- Issues: 21
- Pull requests: 78
- Average time to close issues: 12 days
- Average time to close pull requests: 1 day
- Issue authors: 9
- Pull request authors: 4
- Average comments per issue: 0.62
- Average comments per pull request: 0.04
- Merged pull requests: 61
- Bot issues: 0
- Bot pull requests: 4
Top Authors
Issue Authors
- andrewherren (16)
- jaredsmurray (2)
- jdtuck (1)
- arainboldt (1)
- marcoBmota8 (1)
- debajyotid (1)
- Gattocrucco (1)
- drizmiz (1)
- awooddoughty (1)
- kevinli1324 (1)
- d-vct (1)
Pull Request Authors
- andrewherren (88)
- github-actions[bot] (4)
- limapsd (1)
- MichaelChirico (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 2
- Total downloads:
  - pypi: 219 last month
  - cran: 474 last month
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 4
- Total maintainers: 3
pypi.org: stochtree
Stochastic Tree Ensembles for Machine Learning and Causal Inference
- Homepage: https://stochtree.ai/
- Documentation: https://stochtree.ai/python_docs/index.html
- License: MIT
- Latest release: 0.1.0 (published 11 months ago)
Rankings
Maintainers (2)
cran.r-project.org: stochtree
Stochastic Tree Ensembles (XBART and BART) for Supervised Learning and Causal Inference
- Homepage: https://stochtree.ai/
- Documentation: http://cran.r-project.org/web/packages/stochtree/stochtree.pdf
- License: MIT + file LICENSE
- Latest release: 0.1.1 (published about 1 year ago)