Updated 9 months ago

xgboost • Rank 34.6 • Science 64%

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow.

Updated 9 months ago

hgboost • Rank 12.1 • Science 54%

hgboost is a Python package for hyper-parameter optimization of xgboost, catboost or lightboost models using cross-validation, with the results evaluated on an independent validation set. hgboost can be applied to classification and regression tasks.

Updated 9 months ago

dtreeviz • Rank 24.2 • Science 36%

A Python library for decision tree visualization and model interpretation.

Updated 9 months ago

python-schema-matching • Science 67%

A Python tool that uses XGBoost and sentence-transformers to perform schema matching on tables.

Updated 9 months ago

https://github.com/amr-yasser226/machine-learning-for-network-intrusion-detection • Science 26%

A complete pipeline for network intrusion detection comparing label encoding and one-hot encoding, with SMOTE resampling, feature selection, and ensemble modeling using scikit-learn and XGBoost. This was phase one of our university's "CSAI 253 - Machine Learning" course.

Updated 9 months ago

https://github.com/amr-yasser226/intrusion-detection-kaggle • Science 26%

End-to-end pipeline for multi-class cyber-attack detection using per-flow network features: data profiling, deduplication, skew-correction, outlier treatment, feature engineering, imbalance handling, and tree-based modeling (XGBoost, LightGBM, CatBoost, stacking), with a final Kaggle submission scoring 0.9146 public / 0.9163 private.

Updated 9 months ago

https://github.com/ccao-data/report-model-benchmark • Science 13%

Benchmark of timing for CCAO models on different hardware

Updated 9 months ago

robusttrees • Science 28%

[ICML 2019, 20 min long talk] Robust Decision Trees Against Adversarial Examples

Updated 9 months ago

https://github.com/ahmedshahriar/customer-churn-prediction • Science 13%

Extensive EDA of the IBM telco customer churn dataset, with various statistical hypothesis tests, a single-level stacking ensemble, and hyperparameter tuning using Optuna.

Updated 9 months ago

https://github.com/cyriljl/apyxl • Science 13%

apyxl simplifies non-linear regression, classification, and model explainability for all users.

Updated 9 months ago

ramanspectrumpredictor_qm9 • Science 26%

Predict Raman spectra of organic molecules with our ML pipeline using RDKit descriptors and a QM9-style dataset. 🌟🔍 Explore the project on GitHub!

Updated 9 months ago

https://github.com/cryogars/uavsar-lidar-ml-project • Science 36%

A project on predicting snow depth using L-Band InSAR parameters.

Updated 9 months ago

go-ml-benchmarks • Science 54%

⏱ Benchmarks of machine learning inference for Go

Updated 9 months ago

crop-yield-estimate • Science 44%

Harness the power of machine learning to forecast rice and wheat crop yields per acre in India, aiming to empower smallholder farmers, combat poverty and malnutrition, utilizing data from Digital Green surveys to revolutionize agriculture and promote sustainable practices in the face of climate change for enhanced global food security.

Updated 9 months ago

supertree • Science 31%

Visualize decision trees in Python