metasklearn

MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.

https://github.com/thieu1995/metasklearn

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.8%) to scientific vocabulary

Keywords

adaboost bayesian-optimization decision-tree gridsearchcv hyperparameter-optimization hyperparameter-tuning knn metaheuristic-search nature-inspired-algorithms random-forest randomized-search scikit-compatible scikit-learn svm xgboost
Last synced: 4 months ago

Repository

MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.

Basic Info
Statistics
  • Stars: 5
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 3
Topics
adaboost bayesian-optimization decision-tree gridsearchcv hyperparameter-optimization hyperparameter-tuning knn metaheuristic-search nature-inspired-algorithms random-forest randomized-search scikit-compatible scikit-learn svm xgboost
Created 8 months ago · Last pushed 7 months ago
Metadata Files
Readme Changelog License Code of conduct Citation

README.md

MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.



🌟 Overview

MetaSklearn is a flexible and extensible Python library that brings metaheuristic optimization to hyperparameter tuning of scikit-learn models. It provides a seamless interface to optimize hyperparameters using nature-inspired algorithms from the Mealpy library. It is designed to be user-friendly and efficient, making it easy to integrate into your machine learning workflow.

🚀 Features

  • ✅ Hyperparameter optimization with metaheuristic algorithms from Mealpy.
  • ✅ Compatible with any scikit-learn model (SVM, RandomForest, XGBoost, etc.)
  • ✅ Supports classification and regression tasks
  • ✅ Custom and scikit-learn scoring support
  • ✅ Integration with PerMetrics for rich evaluation metrics
  • ✅ Scikit-learn compatible API: .fit(), .predict(), .score()

📦 Installation

Install the latest version using pip:

```bash
pip install metasklearn
```

After that, check the version to ensure successful installation:

```sh
$ python
>>> import metasklearn
>>> metasklearn.__version__
```

🧠 How It Works

MetaSklearn defines a custom MetaSearchCV class that wraps your model and performs hyperparameter tuning using any optimizer supported by Mealpy. The framework evaluates model performance using either scikit-learn's metrics or additional ones from the PerMetrics library.

🚀 Quick Start

📘 Example with SVM model for regression task

```python
from sklearn.svm import SVR
from sklearn.datasets import load_diabetes
from metasklearn import MetaSearchCV, FloatVar, StringVar, Data

## Load data object
X, y = load_diabetes(return_X_y=True)
data = Data(X, y)

## Split train and test
data.split_train_test(test_size=0.2, random_state=42, inplace=True)
print(data.X_train.shape, data.X_test.shape)

## Scale dataset
data.X_train, scaler_X = data.scale(data.X_train, scaling_methods=("standard", "minmax"))
data.X_test = scaler_X.transform(data.X_test)

data.y_train, scaler_y = data.scale(data.y_train, scaling_methods=("standard", "minmax"))
data.y_train = data.y_train.ravel()
data.y_test = scaler_y.transform(data.y_test.reshape(-1, 1)).ravel()

## Define param bounds for SVR.
## This is the equivalent GridSearchCV space, shown to illustrate how to convert it to MetaSearchCV:
# param_bounds = {
#     "C": [0.1, 100],
#     "gamma": [1e-4, 1],
#     "kernel": ["linear", "rbf", "poly"]
# }
param_bounds = [
    FloatVar(lb=0., ub=100., name="C"),
    FloatVar(lb=1e-4, ub=1., name="gamma"),
    StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]

## Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
    estimator=SVR(),
    param_bounds=param_bounds,
    task_type="regression",
    optim="BaseGA",
    optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
    cv=3,
    scoring="MSE",          # or any custom scoring like "F1_macro"
    seed=42,
    n_jobs=2,
    verbose=True,
    mode='single',
    n_workers=None,
    termination=None
)

searcher.fit(data.X_train, data.y_train)
print("Best parameters (Regression):", searcher.best_params)
print("Best model: ", searcher.best_estimator)
print("Best score during searching: ", searcher.best_score)

## Make prediction after re-fit
y_pred = searcher.predict(data.X_test)
print("Test Score:", searcher.score(data.X_test, data.y_test))
print("Test Scores: ", searcher.scores(data.X_test, data.y_test, list_metrics=("RMSE", "R", "KGE", "NNSE")))
```

📘 Example with SVM model for classification task

```python
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from metasklearn import MetaSearchCV, FloatVar, StringVar

## Load dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

## Define param bounds for SVC.
## This is the equivalent GridSearchCV space, shown to illustrate how to convert it to MetaSearchCV:
# param_bounds = {
#     "C": [0.1, 100],
#     "gamma": [1e-4, 1],
#     "kernel": ["linear", "rbf", "poly"]
# }
param_bounds = [
    FloatVar(lb=0., ub=100., name="C"),
    FloatVar(lb=1e-4, ub=1., name="gamma"),
    StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]

## Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
    estimator=SVC(),
    param_bounds=param_bounds,
    task_type="classification",
    optim="BaseGA",
    optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
    cv=3,
    scoring="AS",           # or any custom scoring like "F1_macro"
    seed=42,
    n_jobs=2,
    verbose=True,
    mode='single',
    n_workers=None,
    termination=None
)

searcher.fit(X_train, y_train)
print("Best parameters (Classification):", searcher.best_params)
print("Best model: ", searcher.best_estimator)
print("Best score during searching: ", searcher.best_score)

## Make prediction after re-fit
y_pred = searcher.predict(X_test)
print("Test Accuracy:", searcher.score(X_test, y_test))
print("Test Scores: ", searcher.scores(X_test, y_test, list_metrics=("AS", "RS", "PS", "F1S")))
```

As you can see, MetaSearchCV is used like any other estimator from the scikit-learn library, and the same pattern applies to models such as Random Forest, Decision Tree, or XGBoost; a brief sketch follows below.
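For illustration, here is a minimal sketch of the same workflow with a Random Forest classifier; the search space and optimizer settings are illustrative choices, not tuned recommendations:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from metasklearn import MetaSearchCV, IntegerVar, StringVar

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Search space for a Random Forest, defined the same way as for the SVC above
param_bounds = [
    IntegerVar(lb=10, ub=300, name="n_estimators"),
    IntegerVar(lb=2, ub=20, name="max_depth"),
    StringVar(valid_sets=("gini", "entropy"), name="criterion"),
]

searcher = MetaSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_bounds=param_bounds,
    task_type="classification",
    optim="BaseGA",
    optim_params={"epoch": 10, "pop_size": 20},
    cv=3,
    scoring="AS",
    seed=42,
)
searcher.fit(X_train, y_train)
print("Best parameters:", searcher.best_params)
print("Test Accuracy:", searcher.score(X_test, y_test))
```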

📋 Parameters: Variable Types in MetaSearchCV and How to Choose Them

This section explains how to use the different variable types provided by metasklearn when defining hyperparameter search spaces for MetaSearchCV. Each variable type is suited to a different kind of hyperparameter.

1. IntegerVar – Integer Variable

```python
from metasklearn import IntegerVar

var = IntegerVar(lb=1, ub=100, name="n_estimators")
```
Used for discrete numerical parameters such as the number of neighbors in KNN, the number of estimators in ensembles, etc.

2. FloatVar – Float/Continuous Variable

```python
from metasklearn import FloatVar

var = FloatVar(lb=0.001, ub=1.0, name="learning_rate")
```
Used for continuous numerical parameters such as `learning_rate`, `C`, `gamma`, etc.

3. StringVar – Categorical/String Variable

```python
from metasklearn import StringVar

var = StringVar(valid_sets=("linear", "poly", "rbf"), name="kernel")
```
Used for string parameters with a limited number of choices, e.g., `kernel` in SVM. The value `None` can also be included in the set.

4. BinaryVar – Binary Variable (0 or 1)

```python
from metasklearn import BinaryVar

var = BinaryVar(n_vars=1, name="feature_selected")
```
Used in binary feature selection problems or any 0/1-based decision.

5. BoolVar – Boolean Variable (True or False)

```python
from metasklearn import BoolVar

var = BoolVar(n_vars=1, name="use_bias")
```
Used for Boolean-type arguments such as `fit_intercept`, `use_bias`, etc.

6. CategoricalVar - A set of mixed discrete variables such as int, float, string, None

```python
from metasklearn import CategoricalVar

var = CategoricalVar(valid_sets=((3., None, "alpha"), (5, 12, 32), ("auto", "exp", "sin")), name="categorical")
```

This variable type is useful when a hyperparameter can take on a predefined set of mixed values (int, float, string, bool, None, ...), as often happens in optimization tasks.

7. SequenceVar - Variables as tuple, list, or set

```python
from metasklearn import SequenceVar

var = SequenceVar(valid_sets=((10,), (20, 15), (30, 10, 5)), return_type=list, name="hidden_layer_sizes")
```

This type of variable is useful for defining hyperparameters that represent sequences, such as the sizes of hidden layers in a neural network.

8. PermutationVar – Permutation Variable

```python
from metasklearn import PermutationVar

var = PermutationVar(valid_set=(1, 2, 5, 10), name="job_order")
```
Used for optimization problems involving permutations, like scheduling or routing.

9. TransferBinaryVar – Transfer Binary Variable

```python
from metasklearn import TransferBinaryVar

var = TransferBinaryVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., all_zeros=True, name="transfer_binary")
```
Used in binary search spaces that support transformation-based metaheuristics.

10. TransferBoolVar – Transfer Boolean Variable

```python
from metasklearn import TransferBoolVar

var = TransferBoolVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., name="transfer_bool")
```
Used in Boolean search spaces with transferable logic between states.

🔧 Example: Define a Mixed Search Space

```python
from metasklearn import (IntegerVar, FloatVar, StringVar, BinaryVar, BoolVar, PermutationVar,
                         CategoricalVar, SequenceVar, TransferBinaryVar, TransferBoolVar)

param_bounds = [
    IntegerVar(lb=1, ub=20, name="n_neighbors"),
    FloatVar(lb=0.001, ub=1.0, name="alpha"),
    StringVar(valid_sets=["uniform", "distance"], name="weights"),
    BinaryVar(name="use_feature"),
    BoolVar(name="fit_bias"),
    PermutationVar(valid_set=(1, 2, 5, 10), name="job_order"),
    CategoricalVar(valid_sets=[0.1, "relu", False, None, 3], name="activation_choice"),
    SequenceVar(valid_sets=((10,), (20, 10), (30, 50, 5)), name="mixed_choice"),
    TransferBinaryVar(name="bin_transfer"),
    TransferBoolVar(name="bool_transfer")
]
```
Use this format when designing hyperparameter spaces for advanced models in `MetaSearchCV`.

⚙ Supported Optimizers

MetaSklearn integrates all metaheuristic algorithms from Mealpy, including:

  • AOA (Arithmetic Optimization Algorithm)
  • GWO (Grey Wolf Optimizer)
  • PSO (Particle Swarm Optimization)
  • DE (Differential Evolution)
  • WOA, SSA, MVO, and many more...

You can pass either an optimizer name (string) or an instantiated optimizer object to MetaSearchCV, as sketched below. For the full list of available optimizers, please refer to the Mealpy documentation.
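A minimal sketch of both options follows; the optimizer name "OriginalGWO" and the top-level `from mealpy import GWO` import are assumptions based on Mealpy 3.x and may differ in your installed version:

```python
from sklearn.svm import SVC
from metasklearn import MetaSearchCV, FloatVar

param_bounds = [FloatVar(lb=0.1, ub=100., name="C")]

# Option 1: pass the optimizer by name, with its parameters as a dict
searcher = MetaSearchCV(
    estimator=SVC(), param_bounds=param_bounds, task_type="classification",
    optim="OriginalGWO", optim_params={"epoch": 30, "pop_size": 20},
    cv=3, scoring="AS", seed=42,
)

# Option 2: pass an already-instantiated Mealpy optimizer object
# (assumes Mealpy >= 3.x, where optimizer modules are importable from the top-level package)
from mealpy import GWO

searcher = MetaSearchCV(
    estimator=SVC(), param_bounds=param_bounds, task_type="classification",
    optim=GWO.OriginalGWO(epoch=30, pop_size=20),
    cv=3, scoring="AS", seed=42,
)
```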

📊 Custom Metrics

You can use custom scoring functions from:

  • sklearn.metrics.get_scorer_names()

  • permetrics.RegressionMetric and ClassificationMetric

For details on the PerMetrics library, please refer to the PerMetrics documentation; a short sketch of both options follows below.
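A short sketch combining both options, assuming the PerMetrics short code "RMSE" is accepted the same way as the "MSE" and "AS" codes used in the Quick Start examples:

```python
from sklearn.datasets import load_diabetes
from sklearn.svm import SVR
from sklearn.metrics import get_scorer_names
from metasklearn import MetaSearchCV, FloatVar

# Any scorer name returned by get_scorer_names() can be passed as `scoring`
print(sorted(get_scorer_names())[:5])

# PerMetrics short codes (regression: "MSE", "RMSE", ...; classification: "AS", "F1S", ...)
# are also accepted, as in the Quick Start examples above.
X, y = load_diabetes(return_X_y=True)
searcher = MetaSearchCV(
    estimator=SVR(),
    param_bounds=[FloatVar(lb=0.1, ub=100., name="C")],
    task_type="regression",
    optim="BaseGA",
    optim_params={"epoch": 10, "pop_size": 20},
    cv=3,
    scoring="RMSE",
    seed=42,
)
searcher.fit(X, y)
```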

📚 Documentation

Documentation is available at: 👉 https://metasklearn.readthedocs.io

You can build the documentation locally:

```shell
cd docs
make html
```

🧪 Testing

You can run unit tests using:

```shell
pytest tests/
```

🤝 Contributing

We welcome contributions to MetaSklearn! If you have suggestions, improvements, or bug fixes, feel free to fork the repository, create a pull request, or open an issue.

📄 License

This project is licensed under the GPLv3 License. See the LICENSE file for more details.

Citation Request

Please include this citation if you plan to use this library:

```bibtex
@software{thieu20250510MetaSklearn,
  author = {Nguyen Van Thieu},
  title = {MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models},
  month = June,
  year = 2025,
  doi = {10.6084/m9.figshare.28978805},
  url = {https://github.com/thieu1995/MetaSklearn}
}
```

Official Links

  • Official source code repo: https://github.com/thieu1995/MetaSklearn
  • Official document: https://metasklearn.readthedocs.io/
  • Download releases: https://pypi.org/project/metasklearn/
  • Issue tracker: https://github.com/thieu1995/MetaSklearn/issues
  • Notable changes log: https://github.com/thieu1995/MetaSklearn/blob/master/ChangeLog.md
  • Official chat group: https://t.me/+fRVCJGuGJg1mNDg1

Developed by: Thieu @ 2025

Owner

  • Name: Nguyen Van Thieu
  • Login: thieu1995
  • Kind: user
  • Location: Earth
  • Company: AIIR Group

Knowledge is power, sharing it is the premise of progress in life. It seems like a burden to someone, but it is the only way to achieve immortality.

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Van Thieu"
    given-names: "Nguyen"
    orcid: "https://orcid.org/0000-0001-9994-8747"
title: "MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models"
version: v0.3.0
doi: 10.6084/m9.figshare.28978805
date-released: 2025-06-04
url: "https://github.com/thieu1995/MetaSklearn"

GitHub Events

Total
  • Release event: 6
  • Watch event: 5
  • Delete event: 3
  • Member event: 3
  • Push event: 26
  • Create event: 8
Last Year
  • Release event: 6
  • Watch event: 5
  • Delete event: 3
  • Member event: 3
  • Push event: 26
  • Create event: 8

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 74
  • Total Committers: 2
  • Avg Commits per committer: 37.0
  • Development Distribution Score (DDS): 0.486
Past Year
  • Commits: 74
  • Committers: 2
  • Avg Commits per committer: 37.0
  • Development Distribution Score (DDS): 0.486
Top Committers
Name Email Commits
Thieu Nguyen n****2@g****m 38
anh9895 h****5@g****m 36

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 37 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: metasklearn

MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 37 Last month
Rankings
Dependent packages count: 9.2%
Average: 30.4%
Dependent repos count: 51.7%
Maintainers (1)
Last synced: 4 months ago