yggdrasil-decision-forests
A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 6 DOI reference(s) in README
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 35 committers (2.9%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: google
- License: apache-2.0
- Language: C++
- Default Branch: main
- Homepage: https://ydf.readthedocs.io/
- Size: 45.2 MB
Statistics
- Stars: 603
- Watchers: 16
- Forks: 65
- Open Issues: 43
- Releases: 37
Metadata Files
README.md
YDF (Yggdrasil Decision Forests) is a library to train, evaluate, interpret, and serve Random Forest, Gradient Boosted Decision Trees, CART and Isolation forest models.
See the documentation (https://ydf.readthedocs.io/) for more information on YDF.
Installation
To install YDF from PyPI, run:
```shell
pip install ydf -U
```
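To confirm that the installation worked, one option is to ask Python's standard `importlib.metadata` module which version of the `ydf` distribution is present. This is a minimal sketch rather than an official YDF check; the helper name `installed_version` is hypothetical and introduced only for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "ydf") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("ydf is not installed; run: pip install ydf -U")
    else:
        print(f"ydf {version} is installed")
```

The lookup reads package metadata only, so it does not import `ydf` itself and works even in an environment where the package is missing.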
Usage example
```python
import ydf
import pandas as pd

# Load dataset with Pandas
ds_path = "https://raw.githubusercontent.com/google/yggdrasil-decision-forests/main/yggdrasil_decision_forests/test_data/dataset/"
train_ds = pd.read_csv(ds_path + "adult_train.csv")
test_ds = pd.read_csv(ds_path + "adult_test.csv")

# Train a Gradient Boosted Trees model
model = ydf.GradientBoostedTreesLearner(label="income").train(train_ds)

# Look at a model (input features, training logs, structure, etc.)
model.describe()

# Evaluate a model (e.g. roc, accuracy, confusion matrix, confidence intervals)
model.evaluate(test_ds)

# Generate predictions
model.predict(test_ds)

# Analyse a model (e.g. partial dependence plot, variable importance)
model.analyze(test_ds)

# Benchmark the inference speed of a model
model.benchmark(test_ds)

# Save the model
model.save("/tmp/my_model")
```
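The evaluation step above reports metrics such as accuracy and a confusion matrix. As a rough illustration of what those two quantities mean, the sketch below computes them by hand with the standard library only. It is not YDF's implementation (`model.evaluate` does this for you), and the function names and toy labels are hypothetical, not taken from the adult dataset.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def accuracy(labels: Sequence[str], predictions: Sequence[str]) -> float:
    """Fraction of examples whose predicted label equals the true label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def confusion_matrix(
    labels: Sequence[str], predictions: Sequence[str]
) -> Dict[Tuple[str, str], int]:
    """Count how often each (true label, predicted label) pair occurs."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    return dict(Counter(zip(labels, predictions)))


# Hypothetical toy data for illustration only.
labels = ["<=50K", "<=50K", ">50K", ">50K", "<=50K"]
predictions = ["<=50K", ">50K", ">50K", "<=50K", "<=50K"]

print(accuracy(labels, predictions))  # 3 of 5 correct -> 0.6
print(confusion_matrix(labels, predictions))
```

Each key of the returned dictionary is a `(true, predicted)` pair, so the diagonal entries (where both labels match) are the correct predictions that accuracy counts.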
Example with the C++ API.
```c++
auto dataset_path = "csv:train.csv";

// List columns in training dataset
DataSpecification spec;
CreateDataSpec(dataset_path, false, {}, &spec);

// Create a training configuration
TrainingConfig train_config;
train_config.set_learner("RANDOM_FOREST");
train_config.set_task(Task::CLASSIFICATION);
train_config.set_label("my_label");

// Train model
std::unique_ptr<AbstractLearner> learner;
GetLearner(train_config, &learner);
auto model = learner->Train(dataset_path, spec);

// Export model
SaveModel("my_model", model.get());
```
(based on examples/beginner.cc)
Next steps
Check the Getting Started tutorial 🧭.
Citation
If you use Yggdrasil Decision Forests in a scientific publication, please cite the following paper: Yggdrasil Decision Forests: A Fast and Extensible Decision Forests Library.
Bibtex
@inproceedings{GBBSP23,
author = {Mathieu Guillame{-}Bert and
Sebastian Bruch and
Richard Stotz and
Jan Pfeifer},
title = {Yggdrasil Decision Forests: {A} Fast and Extensible Decision Forests
Library},
booktitle = {Proceedings of the 29th {ACM} {SIGKDD} Conference on Knowledge Discovery
and Data Mining, {KDD} 2023, Long Beach, CA, USA, August 6-10, 2023},
pages = {4068--4077},
year = {2023},
url = {https://doi.org/10.1145/3580305.3599933},
doi = {10.1145/3580305.3599933},
}
Raw
Yggdrasil Decision Forests: A Fast and Extensible Decision Forests Library, Guillame-Bert et al., KDD 2023: 4068-4077. doi:10.1145/3580305.3599933
Contact
You can contact the core development team at decision-forests-contact@google.com.
Credits
Yggdrasil Decision Forests and TensorFlow Decision Forests are developed by:
- Mathieu Guillame-Bert (gbm AT google DOT com)
- Richard Stotz (richardstotz AT google DOT com)
- Jan Pfeifer (janpf AT google DOT com)
- Sebastian Bruch (sebastian AT bruch DOT io)
- Arvind Srinivasan (arvnd AT google DOT com)
Contributing
Contributions to TensorFlow Decision Forests and Yggdrasil Decision Forests are welcome. If you want to contribute, check the contribution guidelines.
License
Apache License 2.0
Owner
- Name: Google
- Login: google
- Kind: organization
- Email: opensource@google.com
- Location: United States of America
- Website: https://opensource.google/
- Twitter: GoogleOSS
- Repositories: 2,773
- Profile: https://github.com/google
Google ❤️ Open Source
GitHub Events
Total
- Create event: 9
- Release event: 5
- Issues event: 63
- Watch event: 118
- Delete event: 2
- Issue comment event: 105
- Push event: 230
- Pull request review event: 11
- Pull request review comment event: 13
- Pull request event: 31
- Fork event: 18
Last Year
- Create event: 9
- Release event: 5
- Issues event: 63
- Watch event: 118
- Delete event: 2
- Issue comment event: 105
- Push event: 230
- Pull request review event: 11
- Pull request review comment event: 13
- Pull request event: 31
- Fork event: 18
Committers
Last synced: 6 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Mathieu Guillame-Bert | g****m@g****m | 680 |
| Richard Stotz | r****z@g****m | 460 |
| TensorFlow Decision Forests Team | n****y@g****m | 58 |
| Damiano Amatruda | d****a@g****m | 9 |
| Jan Pfeifer | j****f@g****m | 8 |
| Arvind Srinivasan | a****d@g****m | 3 |
| Ariel Lubonja | a****l@c****u | 3 |
| Dmitry Tsarkov | t****r@g****m | 3 |
| Yggdrasil Decision Forests Team | d****t@g****m | 2 |
| Jake VanderPlas | v****s@g****m | 2 |
| Alejandro Barrachina Argudo | 4****2 | 2 |
| Emmanuel Ferdman | e****n@g****m | 2 |
| Howard Chiam | h****m | 2 |
| Ivo Ristovski List | i****t@g****m | 2 |
| Jean-Baptiste Lespiau | j****u@g****m | 2 |
| Peter Hawkins | p****s@g****m | 2 |
| Bogdan Graur | b****r@g****m | 1 |
| Alejandro Cruzado-Ruiz | l****m@g****m | 1 |
| Alex | b****z | 1 |
| Arno Eigenwillig | a****w@g****m | 1 |
| Chris Kennelly | c****y@g****m | 1 |
| David Dunleavy | d****y@g****m | 1 |
| Florian Mayer | f****r@g****m | 1 |
| Hana Joo | h****o@g****m | 1 |
| John Cater | j****r@g****m | 1 |
| John QiangZhang | j****g@g****m | 1 |
| Laramie Leavitt | l****r@g****m | 1 |
| Matthew Soulanille | m****w@s****t | 1 |
| Mehdi Amini | a****m@g****m | 1 |
| Michelangelo Conserva | m****a@g****m | 1 |
| and 5 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 133
- Total pull requests: 55
- Average time to close issues: 2 months
- Average time to close pull requests: 10 days
- Total issue authors: 76
- Total pull request authors: 20
- Average comments per issue: 3.08
- Average comments per pull request: 0.93
- Merged pull requests: 35
- Bot issues: 0
- Bot pull requests: 3
Past Year
- Issues: 54
- Pull requests: 30
- Average time to close issues: 16 days
- Average time to close pull requests: 5 days
- Issue authors: 42
- Pull request authors: 10
- Average comments per issue: 1.41
- Average comments per pull request: 1.13
- Merged pull requests: 16
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- JoseAF (9)
- Arnold1 (6)
- CodingDoug (5)
- lusis-ai (5)
- andrea-cassioli-maersk (5)
- TonyCongqianWang (4)
- jimidle (4)
- rlcauvin (4)
- alpetukhov (3)
- marquisthunder (3)
- stephen-up (3)
- salamanders (3)
- omit-ai (3)
- PSSF23 (3)
- patrickjedlicka (3)
Pull Request Authors
- achoum (18)
- rstz (13)
- dependabot[bot] (4)
- YueWan1 (3)
- ariellubonja (3)
- hchiam (3)
- emmanuel-ferdman (3)
- ALK222 (2)
- fmayer (2)
- copybara-service[bot] (1)
- Neutrovertido (1)
- janpfeifer (1)
- stephen-up (1)
- Willian-Zhang (1)
- LarytheLord (1)
Packages
- Total packages: 5
- Total downloads:
  - npm: 8,171 last month
  - pypi: 91,609 last month
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 1 (may contain duplicates)
- Total versions: 84
- Total maintainers: 4
proxy.golang.org: github.com/google/yggdrasil-decision-forests/yggdrasil_decision_forests/port/go
- Homepage: https://github.com/google/yggdrasil-decision-forests
- Documentation: https://pkg.go.dev/github.com/google/yggdrasil-decision-forests/yggdrasil_decision_forests/port/go#section-documentation
- License: Apache-2.0, MIT
- Latest release: v1.5.0 (published over 2 years ago)
pypi.org: ydf
YDF (short for Yggdrasil Decision Forests) is a library for training, serving, evaluating and analyzing decision forest models such as Random Forest and Gradient Boosted Trees.
- Homepage: https://github.com/google/yggdrasil-decision-forests
- Documentation: https://ydf.readthedocs.io/
- License: Apache 2.0
- Latest release: 0.13.0 (published 7 months ago)
- Maintainers: 2
npmjs.org: ydf-training
Training YDF models in Javascript.
- Homepage: https://ydf.readthedocs.io
- License: Apache-2.0
- Latest release: 0.0.1 (published over 1 year ago)
npmjs.org: yggdrasil-decision-forests
With this package, you can generate predictions of machine learning models trained with YDF in browser and with NodeJS.
- Homepage: https://ydf.readthedocs.io
- License: Apache-2.0
- Latest release: 0.0.3 (published over 1 year ago)
- Maintainers: 1
npmjs.org: ydf-inference
With this package, you can generate predictions of machine learning models trained with YDF in browser and with NodeJS.
- Homepage: https://ydf.readthedocs.io
- License: Apache-2.0
- Latest release: 0.0.4 (published over 1 year ago)
- Maintainers: 1
Dependencies
- github.com/google/go-cmp v0.5.8
- google.golang.org/protobuf v1.28.1
- github.com/golang/protobuf v1.5.0
- github.com/google/go-cmp v0.5.5
- github.com/google/go-cmp v0.5.8
- golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543
- google.golang.org/protobuf v1.26.0-rc.1
- google.golang.org/protobuf v1.28.1
- myst-parser *
- readthedocs-sphinx-search ==0.1.1
- sphinx ==4.2.0
- sphinx-autoapi *
- sphinx-autodoc-typehints *
- sphinx-book-theme >=0.3.3
- sphinx-copybutton >=0.5.0
- sphinx-remove-toctrees *
- sphinx-sitemap *
- sphinx_design *
- sphinx_rtd_theme ==1.0.0