sdtf

Exploring streaming options for decision trees and random forests. Based on a fork of scikit-learn.

https://github.com/neurodata/sdtf

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.3%) to scientific vocabulary

Keywords

classification decision-trees machine-learning streaming-data
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: neurodata
  • License: other
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage: https://sdtf.neurodata.io
  • Size: 120 MB
Statistics
  • Stars: 9
  • Watchers: 2
  • Forks: 2
  • Open Issues: 9
  • Releases: 11
Topics
classification decision-trees machine-learning streaming-data
Created over 5 years ago · Last pushed 5 months ago
Metadata Files
Readme License Citation

README.md

Streaming Decision Trees & Forests

SDTF (Streaming Decision Trees & Forests) is a Python package for exploring streaming options for decision trees and random forests.

The package includes two ensemble implementations (Stream Decision Forest and Cascade Stream Forest).

It is based on a fork of scikit-learn.

  • Documentation: https://sdtf.neurodata.io/
  • Installation Guide: https://sdtf.neurodata.io/#install
  • Benchmark Visual: https://sdtf.neurodata.io/visual.html
  • API Reference: https://sdtf.neurodata.io/api.html
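
As a rough illustration of what "streaming" means for a classifier, the sketch below runs a test-then-train (prequential) loop over mini-batches: each batch is scored before the model learns from it, and earlier batches are never revisited. `MajorityClassStream` is a hypothetical stand-in learner written for this example and is not part of SDTF. That SDTF's forests expose a scikit-learn style `partial_fit(X, y, classes=...)` method is an assumption here and should be checked against the API reference.

```python
from collections import Counter


class MajorityClassStream:
    """Hypothetical stand-in learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def partial_fit(self, X, y, classes=None):
        # Update label counts from the new batch only; old batches are not stored.
        self.counts.update(y)
        return self

    def predict(self, X):
        # With no data seen yet there is no majority label, so fall back to None.
        label = self.counts.most_common(1)[0][0] if self.counts else None
        return [label for _ in X]


def prequential_accuracy(model, batches, classes):
    """Test-then-train: score each batch before the model learns from it."""
    correct = total = 0
    for i, (X, y) in enumerate(batches):
        if i > 0:  # the first batch is used for training only
            preds = model.predict(X)
            correct += sum(p == t for p, t in zip(preds, y))
            total += len(y)
        model.partial_fit(X, y, classes=classes)
    return correct / total if total else 0.0


# Toy stream of three mini-batches (features, labels); the values are made up.
batches = [
    ([[0.1], [0.2], [0.9]], [0, 0, 1]),
    ([[0.3], [0.8]], [0, 1]),
    ([[0.7], [0.6], [0.4]], [1, 1, 0]),
]
acc = prequential_accuracy(MajorityClassStream(), batches, classes=[0, 1])
print(f"prequential accuracy: {acc:.2f}")
```

Any estimator with the same two methods could be dropped into `prequential_accuracy` in place of the stand-in.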

Owner

  • Name: neurodata
  • Login: neurodata
  • Kind: organization
  • Email: admin@neurodata.io
  • Location: everywhere

Citation (CITATION.cff)

# YAML 1.2
---
authors:
  -
    affiliation: "Johns Hopkins University, Baltimore, MD"
    family-names: Xu
    given-names: Haoyin
    orcid: "https://orcid.org/0000-0001-8235-4950"
  -
    affiliation: "Johns Hopkins University, Baltimore, MD"
    family-names: Dey
    given-names: Jayanta
  -
    affiliation: "Johns Hopkins University, Baltimore, MD"
    family-names: Panda
    given-names: Sambit
    orcid: "https://orcid.org/0000-0001-8455-4243"
  -
    affiliation: "Johns Hopkins University, Baltimore, MD"
    family-names: Vogelstein
    given-names: Joshua
    orcid: "https://orcid.org/0000-0003-2487-6237"
identifiers:
  -
    type: url
    value: "https://arxiv.org/pdf/2110.08483.pdf"
  - type: doi
    value: 10.5281/zenodo.5557864
date-released: 2022-03-01
keywords:
  - "Streaming Trees"
  - "Decision Trees"
  - "Machine Learning"
cff-version: "1.2.0"
license: MIT
doi: 10.5281/zenodo.5557864
message: "If you use SDTF, please cite it using these metadata."
repository-code: "https://github.com/neurodata/SDTF"
title: "Streaming Decision Trees & Forests"
version: "0.1.1"
...

GitHub Events

Total
  • Watch event: 1
  • Push event: 2
Last Year
  • Watch event: 1
  • Push event: 2

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 234
  • Total Committers: 3
  • Avg Commits per committer: 78.0
  • Development Distribution Score (DDS): 0.111
Top Committers
Name        Email            Commits
Haoyin Xu   h****u@g****m    208
Nick Hahn   8****7@u****m    14
Haoyin Xu   h****2@u****u    12
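
The averages above can be reproduced from the per-committer counts. The sketch below assumes the commonly used definition of the Development Distribution Score, one minus the top committer's share of all commits. This definition is not stated on this page, but it does reproduce the reported 0.111. The dictionary keys are labels chosen for this example.

```python
# Commit counts from the Top Committers table; the two "Haoyin Xu" rows are
# separate email addresses and are counted as separate committers.
commits = {
    "Haoyin Xu (first address)": 208,
    "Nick Hahn": 14,
    "Haoyin Xu (second address)": 12,
}

total = sum(commits.values())
avg_per_committer = total / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total

# For comparison: DDS if both "Haoyin Xu" addresses were merged into one person.
merged_top = commits["Haoyin Xu (first address)"] + commits["Haoyin Xu (second address)"]
merged_dds = 1 - merged_top / total

print(total, round(avg_per_committer, 1), round(dds, 3), round(merged_dds, 3))
```

Under this definition, a score near 0 means one committer accounts for almost all of the commits.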
Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 26
  • Total pull requests: 31
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 3 days
  • Total issue authors: 4
  • Total pull request authors: 2
  • Average comments per issue: 0.27
  • Average comments per pull request: 0.61
  • Merged pull requests: 31
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • PSSF23 (23)
  • andrewcheng2016 (1)
  • nhahn7 (1)
  • chrsunwil (1)
Pull Request Authors
  • PSSF23 (29)
  • nhahn7 (3)
Top Labels
Issue Labels
  • enhancement (17)
  • documentation (3)
  • question (3)
  • ndd (2)
  • bug (1)
Pull Request Labels
  • enhancement (22)
  • documentation (17)
  • bug (2)
  • draft (1)

Dependencies

benchmarks/requirements.txt pypi
  • river *
  • scikit-garden *
  • sdtf *
  • torchvision *
dev-requirements.txt pypi
  • black * development
  • cython * development
  • openml * development
  • psutil * development
  • river * development
  • twine * development
  • wheel * development
docs/requirements.txt pypi
  • nbsphinx *
  • numpydoc *
  • sdtf *
  • sphinx *
  • sphinx_rtd_theme *
requirements.txt pypi
  • matplotlib *
  • numpy *
  • scikit-learn *
  • scipy *
  • seaborn *