metasklearn
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.8%) to scientific vocabulary
Repository
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.
Basic Info
- Host: GitHub
- Owner: thieu1995
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Homepage: https://metasklearn.readthedocs.io/
- Size: 137 KB
Statistics
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 3
Metadata Files
README.md
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.
🌟 Overview
MetaSklearn is a flexible and extensible Python library that brings metaheuristic optimization to
hyperparameter tuning of scikit-learn models. It provides a seamless interface to optimize hyperparameters
using nature-inspired algorithms from the Mealpy library.
It is designed to be user-friendly and efficient, making it easy to integrate into your machine learning workflow.
🚀 Features
- ✅ Hyperparameter optimization with metaheuristic algorithms from Mealpy
- ✅ Compatible with any scikit-learn model (SVM, RandomForest, XGBoost, etc.)
- ✅ Supports classification and regression tasks
- ✅ Custom and scikit-learn scoring support
- ✅ Integration with PerMetrics for rich evaluation metrics
- ✅ Scikit-learn compatible API: `.fit()`, `.predict()`, `.score()`
📦 Installation
Install the latest version using pip:
```bash
pip install metasklearn
```

After that, check the version to verify the installation:

```sh
$ python
>>> import metasklearn
>>> metasklearn.__version__
```
🧠 How It Works
MetaSklearn defines a custom MetaSearchCV class that wraps your model and performs hyperparameter tuning using
any optimizer supported by Mealpy. The framework evaluates model performance using either
scikit-learn’s metrics or additional ones from the PerMetrics library.
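To make that idea concrete, the sketch below shows what such a search objective boils down to: decode a candidate into estimator parameters, then score it with cross-validation. This is an illustration only, not the library's internal code; the function name `objective` and the scoring string are hypothetical choices.

```python
# Illustrative sketch only (not MetaSearchCV's actual internals): the fitness of one
# candidate solution is the mean cross-validated score of the estimator configured
# with the decoded hyperparameters.
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

def objective(decoded_params, estimator, X, y, cv=3, scoring="neg_mean_squared_error"):
    """Return the fitness of one candidate: mean CV score with those hyperparameters."""
    model = clone(estimator).set_params(**decoded_params)
    return float(np.mean(cross_val_score(model, X, y, cv=cv, scoring=scoring)))
```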
🚀 Quick Start
📘 Example with SVM model for regression task
```python
from sklearn.svm import SVR
from sklearn.datasets import load_diabetes
from metasklearn import MetaSearchCV, FloatVar, StringVar, Data

## Load data object
X, y = load_diabetes(return_X_y=True)
data = Data(X, y)

## Split train and test
data.split_train_test(test_size=0.2, random_state=42, inplace=True)
print(data.X_train.shape, data.X_test.shape)

## Scale dataset
data.X_train, scaler_X = data.scale(data.X_train, scaling_methods=("standard", "minmax"))
data.X_test = scaler_X.transform(data.X_test)

data.y_train, scaler_y = data.scale(data.y_train, scaling_methods=("standard", "minmax"))
data.y_train = data.y_train.ravel()
data.y_test = scaler_y.transform(data.y_test.reshape(-1, 1)).ravel()

## Define param bounds for SVR
# This is the GridSearchCV-style search space; below shows how to convert it for MetaSearchCV.
# param_bounds = {
#     "C": [0.1, 100],
#     "gamma": [1e-4, 1],
#     "kernel": ["linear", "rbf", "poly"]
# }
param_bounds = [
    FloatVar(lb=0.1, ub=100., name="C"),
    FloatVar(lb=1e-4, ub=1., name="gamma"),
    StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]

## Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
    estimator=SVR(),
    param_bounds=param_bounds,
    task_type="regression",
    optim="BaseGA",
    optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
    cv=3,
    scoring="MSE",      # or any custom scoring such as "RMSE"
    seed=42,
    n_jobs=2,
    verbose=True,
    mode='single',
    n_workers=None,
    termination=None
)
searcher.fit(data.X_train, data.y_train)
print("Best parameters (Regression):", searcher.best_params)
print("Best model:", searcher.best_estimator)
print("Best score during searching:", searcher.best_score)

## Make prediction after re-fit
y_pred = searcher.predict(data.X_test)
print("Test score:", searcher.score(data.X_test, data.y_test))
print("Test metrics:", searcher.scores(data.X_test, data.y_test, list_metrics=("RMSE", "R", "KGE", "NNSE")))
```
📘 Example with SVM model for classification task
```python
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from metasklearn import MetaSearchCV, FloatVar, StringVar

## Load dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

## Define param bounds for SVC
# This is the GridSearchCV-style search space; below shows how to convert it for MetaSearchCV.
# param_bounds = {
#     "C": [0.1, 100],
#     "gamma": [1e-4, 1],
#     "kernel": ["linear", "rbf", "poly"]
# }
param_bounds = [
    FloatVar(lb=0.1, ub=100., name="C"),
    FloatVar(lb=1e-4, ub=1., name="gamma"),
    StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]

## Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
    estimator=SVC(),
    param_bounds=param_bounds,
    task_type="classification",
    optim="BaseGA",
    optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
    cv=3,
    scoring="AS",       # or any custom scoring such as "F1_macro"
    seed=42,
    n_jobs=2,
    verbose=True,
    mode='single',
    n_workers=None,
    termination=None
)
searcher.fit(X_train, y_train)
print("Best parameters (Classification):", searcher.best_params)
print("Best model:", searcher.best_estimator)
print("Best score during searching:", searcher.best_score)

## Make prediction after re-fit
y_pred = searcher.predict(X_test)
print("Test Accuracy:", searcher.score(X_test, y_test))
print("Test metrics:", searcher.scores(X_test, y_test, list_metrics=("AS", "RS", "PS", "F1S")))
```
As you can see, you use it just like any other scikit-learn estimator, and the same pattern works with models such as Random Forest, Decision Tree, XGBoost, and more.
📋 Parameters: Variable Types in MetaSearchCV and How to Choose Them
This section explains how to use the different variable types provided by MetaSklearn when defining hyperparameter
search spaces. Each variable type is suited to a different kind of optimization parameter.
1. IntegerVar – Integer Variable
```python
from metasklearn import IntegerVar

var = IntegerVar(lb=1, ub=100, name="n_estimators")
```
Used for discrete numerical parameters such as the number of neighbors in KNN, the number of estimators in ensembles, etc.
2. FloatVar – Float/Continuous Variable
```python
from metasklearn import FloatVar

var = FloatVar(lb=0.001, ub=1.0, name="learning_rate")
```
Used for continuous numerical parameters such as `learning_rate`, `C`, `gamma`, etc.
3. StringVar – Categorical/String Variable
```python
from metasklearn import StringVar

var = StringVar(valid_sets=("linear", "poly", "rbf"), name="kernel")
```
Used for string parameters with a limited set of choices, e.g., `kernel` in SVM. The value `None` can also be included.
4. BinaryVar – Binary Variable (0 or 1)
```python
from metasklearn import BinaryVar

var = BinaryVar(n_vars=1, name="feature_selected")
```
Used for binary feature selection problems or any 0/1-based decision.
5. BoolVar – Boolean Variable (True or False)
```python
from metasklearn import BoolVar

var = BoolVar(n_vars=1, name="use_bias")
```
Used for Boolean-type arguments such as `fit_intercept`, `use_bias`, etc.
6. CategoricalVar - A set of mixed discrete variables such as int, float, string, None
```python
from metasklearn import CategoricalVar

var = CategoricalVar(valid_sets=((3., None, "alpha"), (5, 12, 32), ("auto", "exp", "sin")), name="categorical")
```
This variable type is useful when a hyperparameter takes values from a predefined set of mixed types (int, float, string, bool, None, ...).
7. SequenceVar - Variables as tuple, list, or set
```python
from metasklearn import SequenceVar

var = SequenceVar(valid_sets=((10,), (20, 15), (30, 10, 5)), return_type=list, name="hidden_layer_sizes")
```
This type of variable is useful for defining hyperparameters that represent sequences, such as the sizes of hidden layers in a neural network.
8. PermutationVar – Permutation Variable
```python
from metasklearn import PermutationVar

var = PermutationVar(valid_set=(1, 2, 5, 10), name="job_order")
```
Used for optimization problems involving permutations, such as scheduling or routing.
9. TransferBinaryVar – Transfer Binary Variable
```python
from metasklearn import TransferBinaryVar

var = TransferBinaryVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., all_zeros=True, name="transfer_binary")
```
Used in binary search spaces that support transfer-function-based metaheuristics.
10. TransferBoolVar – Transfer Boolean Variable
```python
from metasklearn import TransferBoolVar

var = TransferBoolVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., name="transfer_bool")
```
Used in Boolean search spaces with transferable logic between states.
🔧 Example: Define a Mixed Search Space
```python
from metasklearn import (IntegerVar, FloatVar, StringVar, BinaryVar, BoolVar,
                         PermutationVar, CategoricalVar, SequenceVar,
                         TransferBinaryVar, TransferBoolVar)

param_bounds = [
    IntegerVar(lb=1, ub=20, name="n_neighbors"),
    FloatVar(lb=0.001, ub=1.0, name="alpha"),
    StringVar(valid_sets=["uniform", "distance"], name="weights"),
    BinaryVar(name="use_feature"),
    BoolVar(name="fit_bias"),
    PermutationVar(valid_set=(1, 2, 5, 10), name="job_order"),
    CategoricalVar(valid_sets=[0.1, "relu", False, None, 3], name="activation_choice"),
    SequenceVar(valid_sets=((10,), (20, 10), (30, 50, 5)), name="mixed_choice"),
    TransferBinaryVar(name="bin_transfer"),
    TransferBoolVar(name="bool_transfer")
]
```
Use this format when designing hyperparameter spaces for advanced models in `MetaSearchCV`.
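As a usage sketch (an assumption, not taken verbatim from the library's docs), a subset of such a search space can be plugged into MetaSearchCV with a real estimator. Only variables whose names match actual KNeighborsClassifier parameters are kept here, and the MetaSearchCV arguments mirror the Quick Start examples above:

```python
# Hedged sketch: tune only the variables that map onto real KNeighborsClassifier
# parameters, reusing the same MetaSearchCV arguments as in the Quick Start.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from metasklearn import MetaSearchCV, IntegerVar, StringVar

X, y = load_iris(return_X_y=True)

knn_bounds = [
    IntegerVar(lb=1, ub=20, name="n_neighbors"),
    StringVar(valid_sets=("uniform", "distance"), name="weights"),
]

searcher = MetaSearchCV(
    estimator=KNeighborsClassifier(),
    param_bounds=knn_bounds,
    task_type="classification",
    optim="BaseGA",
    optim_params={"epoch": 10, "pop_size": 20},
    cv=3,
    scoring="AS",
    seed=42,
)
searcher.fit(X, y)
print(searcher.best_params)
```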
⚙ Supported Optimizers
MetaSklearn integrates all metaheuristic algorithms from Mealpy, including:
- AOA (Arithmetic Optimization Algorithm)
- GWO (Grey Wolf Optimizer)
- PSO (Particle Swarm Optimization)
- DE (Differential Evolution)
- WOA, SSA, MVO, and many more...
You can pass either an optimizer name or an instantiated optimizer object to MetaSearchCV. For more details, please refer to the Mealpy documentation.
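For example, the sketch below shows both ways of selecting an optimizer. It assumes Mealpy 3.x-style imports (`from mealpy import GWO`) and reuses the MetaSearchCV arguments from the Quick Start; the exact handling of an optimizer instance may differ in practice:

```python
# Hedged sketch of the two ways to choose an optimizer described above.
from sklearn.svm import SVC
from mealpy import GWO
from metasklearn import MetaSearchCV, FloatVar

param_bounds = [FloatVar(lb=0.1, ub=100., name="C")]

# Option 1: refer to the optimizer by its class name and configure it via optim_params.
searcher_by_name = MetaSearchCV(
    estimator=SVC(), param_bounds=param_bounds, task_type="classification",
    optim="BaseGA", optim_params={"epoch": 20, "pop_size": 20},
)

# Option 2: pass an already-instantiated Mealpy optimizer object.
gwo = GWO.OriginalGWO(epoch=20, pop_size=20)
searcher_by_object = MetaSearchCV(
    estimator=SVC(), param_bounds=param_bounds, task_type="classification",
    optim=gwo,
)
```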
📊 Custom Metrics
You can use scoring functions from:
- `sklearn.metrics.get_scorer_names()`
- `permetrics.RegressionMetric` and `permetrics.ClassificationMetric`

For details on the PerMetrics library, please refer to its documentation.
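As a small illustration (with dummy example values): `get_scorer_names()` lists scikit-learn's scorer names, while PerMetrics identifies metrics by short codes such as "RMSE" or "AS", which is the style of name this README passes to `scoring=` and `list_metrics=`:

```python
# Illustration of where valid metric names come from; y_true/y_pred are dummy values.
import numpy as np
from sklearn.metrics import get_scorer_names
from permetrics import RegressionMetric

print(sorted(get_scorer_names())[:5])   # a few of scikit-learn's built-in scorer names

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
evaluator = RegressionMetric(y_true=y_true, y_pred=y_pred)
print(evaluator.RMSE())                 # the same "RMSE" code usable in scoring=/list_metrics=
```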
📚 Documentation
Documentation is available at: 👉 https://metasklearn.readthedocs.io
You can build the documentation locally:
```shell
cd docs
make html
```
🧪 Testing
You can run unit tests using:
```shell
pytest tests/
```
🤝 Contributing
We welcome contributions to MetaSklearn! If you have suggestions, improvements, or bug fixes, feel free to fork
the repository, create a pull request, or open an issue.
📄 License
This project is licensed under the GPLv3 License. See the LICENSE file for more details.
Citation Request
Please include this citation if you plan to use this library:

```bibtex
@software{thieu20250510MetaSklearn,
  author = {Nguyen Van Thieu},
  title  = {MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models},
  month  = {June},
  year   = {2025},
  doi    = {10.6084/m9.figshare.28978805},
  url    = {https://github.com/thieu1995/MetaSklearn}
}
```
Official Links
- Official source code repo: https://github.com/thieu1995/MetaSklearn
- Official document: https://metasklearn.readthedocs.io/
- Download releases: https://pypi.org/project/metasklearn/
- Issue tracker: https://github.com/thieu1995/MetaSklearn/issues
- Notable changes log: https://github.com/thieu1995/MetaSklearn/blob/master/ChangeLog.md
- Official chat group: https://t.me/+fRVCJGuGJg1mNDg1
Developed by: Thieu @ 2025
Owner
- Name: Nguyen Van Thieu
- Login: thieu1995
- Kind: user
- Location: Earth
- Company: AIIR Group
- Website: https://thieu1995.github.io/
- Repositories: 13
- Profile: https://github.com/thieu1995
Knowledge is power, sharing it is the premise of progress in life. It seems like a burden to someone, but it is the only way to achieve immortality.
Citation (CITATION.cff)
cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Van Thieu"
given-names: "Nguyen"
orcid: "https://orcid.org/0000-0001-9994-8747"
title: "MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models"
version: v0.3.0
doi: 10.6084/m9.figshare.28978805
date-released: 2025-06-04
url: "https://github.com/thieu1995/MetaSklearn"
GitHub Events
Total
- Release event: 6
- Watch event: 5
- Delete event: 3
- Member event: 3
- Push event: 26
- Create event: 8
Last Year
- Release event: 6
- Watch event: 5
- Delete event: 3
- Member event: 3
- Push event: 26
- Create event: 8
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Thieu Nguyen | n****2@g****m | 38 |
| anh9895 | h****5@g****m | 36 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Packages
- Total packages: 1
- Total downloads: 37 last month (PyPI)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 1
pypi.org: metasklearn
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models
- Homepage: https://github.com/thieu1995/MetaSklearn
- Documentation: https://metasklearn.readthedocs.io/
- License: GPLv3
- Latest release: 0.3.0 (published 7 months ago)