xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow
skforecast
Time series forecasting with machine learning models
hgboost
hgboost is a Python package for hyper-parameter optimization of xgboost, catboost or lightboost models using cross-validation, with results evaluated on an independent validation set. hgboost can be applied to classification and regression tasks.
mljar-supervised
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
pyscipopt-ml
Python interface to automatically formulate Machine Learning models into Mixed-Integer Programs
metasklearn
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.
dtreeviz
A Python library for decision tree visualization and model interpretation.
https://github.com/nixtla/mlforecast
Scalable machine 🤖 learning for time series forecasting.
https://github.com/alibaba/alink
Alink is a machine learning algorithm platform based on Flink, developed by the PAI team of the Alibaba computing platform.
python-schema-matching
A Python tool that uses XGBoost and sentence-transformers to perform schema matching on tables.
https://github.com/amr-yasser226/machine-learning-for-network-intrusion-detection
A complete pipeline for network intrusion detection that compares label encoding and one-hot encoding, with SMOTE resampling, feature selection, and ensemble modeling using scikit-learn and XGBoost. This was phase one of our university's "CSAI 253 - Machine Learning" course.
https://github.com/amr-yasser226/intrusion-detection-kaggle
End-to-end pipeline for multi-class cyber-attack detection using per-flow network features: data profiling, deduplication, skew-correction, outlier treatment, feature engineering, imbalance handling, and tree-based modeling (XGBoost, LightGBM, CatBoost, stacking), with a final Kaggle submission scoring 0.9146 public / 0.9163 private.
https://github.com/ccao-data/report-model-benchmark
Benchmark of timing for CCAO models on different hardware
robusttrees
[ICML 2019, 20 min long talk] Robust Decision Trees Against Adversarial Examples
https://github.com/ahmedshahriar/customer-churn-prediction
Extensive EDA of the IBM Telco customer churn dataset, with various statistical hypothesis tests, a single-level stacking ensemble, and hyperparameter tuning using Optuna.
https://github.com/cyriljl/apyxl
apyxl simplifies non-linear regressions/classifications and model explainability for all users
ramanspectrumpredictor_qm9
Predict Raman spectra of organic molecules with an ML pipeline using RDKit descriptors and a QM9-style dataset.
https://github.com/cryogars/uavsar-lidar-ml-project
A project on predicting snow depth using L-Band InSAR parameters.
path_based_traffic_flow_prediction
Forecast future traffic flow on a road network.
crop-yield-estimate
Machine learning models to forecast rice and wheat crop yields per acre in India, using data from Digital Green surveys. The project aims to support smallholder farmers, reduce poverty and malnutrition, and promote sustainable agricultural practices and food security in the face of climate change.