auto-sklong

☂️ Auto-Scikit-Longitudinal (Auto-Sklong) is an automated machine learning (AutoML) library for analysing longitudinal data (currently classification tasks only) using several search methods, namely Bayesian Optimisation via SMAC3, Asynchronous Successive Halving, Evolutionary Algorithms, and Random Search via GAMA.

https://github.com/simonprovost/auto-sklong

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: ieee.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary

Keywords

automl classification longitudinal machine-learning repeated-measurements scikit sklong supervised-learning
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 36
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 4
Created about 2 years ago · Last pushed 7 months ago
Metadata Files
Readme, License, Code of conduct, Citation

README.md


Auto-Sklong

An Automated Machine Learning library for longitudinal classification built on GAMA and Scikit-longitudinal

💡 About The Project

Auto-Scikit-Longitudinal (Auto-Sklong) is an Automated Machine Learning (AutoML) library built on the General Automated Machine learning Assistant (GAMA) framework. It introduces a brand-new search space that draws on both Scikit-Longitudinal and Scikit-learn models to tackle longitudinal machine learning classification tasks.

Wait, what is Longitudinal Data? In layman's terms:

Longitudinal data is a "time-lapse" snapshot of the same subject, entity, or group tracked over several time periods, similar to checking in on patients to see how they change. For instance, doctors may monitor a patient's blood pressure, weight, and cholesterol every year for a decade to identify health trends or risk factors. This data is more useful for predicting future results than a one-time survey because it captures evolution, patterns, and cause and effect over time.
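As a minimal sketch of what such data can look like in code, the snippet below stores yearly measurements for two hypothetical patients in a "wide" layout, with one column per variable and wave. The column names (`bp_w1`, `bp_w2`, ...), the values, and the `change_over_waves` helper are illustrative assumptions and not part of Auto-Sklong.

```python
# Hypothetical wide-format longitudinal records: one row per patient,
# one column per (variable, wave) pair. All names and values are made up.
patients = [
    {"id": "p1", "bp_w1": 120, "bp_w2": 128, "bp_w3": 135, "smoker": 0},
    {"id": "p2", "bp_w1": 118, "bp_w2": 117, "bp_w3": 119, "smoker": 1},
]


def change_over_waves(record, variable, n_waves):
    """Return the last-wave value minus the first-wave value of one variable."""
    first = record[f"{variable}_w1"]
    last = record[f"{variable}_w{n_waves}"]
    return last - first


for patient in patients:
    delta = change_over_waves(patient, "bp", n_waves=3)
    print(patient["id"], "blood pressure change:", delta)
```

A single-survey dataset would only hold one of the `bp_w*` columns, so the trend computed here could not be recovered from it.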



🛠️ Installation

[!NOTE] Want to use Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation for full instructions! 🎉

To install Auto-Sklong:

  1. ✅ Install the latest version: `pip install auto-sklong`

To install a specific version: `pip install auto-sklong==0.0.1`

[!CAUTION] Auto-Sklong is currently compatible with Python 3.9 only. Ensure you have this version installed before proceeding.

This limitation stems from the Deep Forest dependency. Follow updates on this GitHub issue.
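A quick way to confirm the interpreter before installing is sketched below. The helper name `is_supported_python` is hypothetical; the check simply encodes the Python 3.9 constraint stated above.

```python
import sys


def is_supported_python(version_info=None):
    """Return True when the given (or current) interpreter is a Python 3.9 release."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return (major, minor) == (3, 9)


if not is_supported_python():
    found = ".".join(str(part) for part in sys.version_info[:3])
    print("Auto-Sklong currently expects Python 3.9; found", found)
```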

If you encounter errors, explore the installation section in the Getting Started of the documentation. If issues persist, open a GitHub issue.


🚀 Getting Started

Here's how to run AutoML on longitudinal data with Auto-Sklong:

```python
from sklearn.metrics import classification_report
from scikit_longitudinal.data_preparation import LongitudinalDataset
from gama.GamaLongitudinalClassifier import GamaLongitudinalClassifier

# Load your dataset (replace 'stroke.csv' with your actual dataset path)
dataset = LongitudinalDataset('./stroke.csv')

# Set up the target column and split the data (replace 'class_stroke_wave_4' with your target)
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Set up feature groups (temporal dependencies)
# Use a pre-set for ELSA data or define manually (see docs for details)
dataset.setup_features_group(input_data="elsa")

# Initialise the AutoML system
automl = GamaLongitudinalClassifier(
    features_group=dataset.feature_groups(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns.tolist(),
    max_total_time=3600,  # Adjust time as needed (in seconds)
)

# Fit the AutoML system
automl.fit(dataset.X_train, dataset.y_train)

# Make predictions
y_pred = automl.predict(dataset.X_test)

# Print the classification report
print(classification_report(dataset.y_test, y_pred))
```
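When the `"elsa"` preset does not match your data, the feature groups have to be defined manually. The sketch below shows one way to derive them from column names. It assumes (this is not confirmed by the text above) that a feature group is a list of column indices, one per wave, and that columns follow a hypothetical `_w<N>` wave suffix. Check the documentation for the exact format `setup_features_group` expects.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: "<variable>_w<wave number>", e.g. "bp_w2".
WAVE_SUFFIX = re.compile(r"^(?P<name>.+)_w(?P<wave>\d+)$")


def build_feature_groups(columns):
    """Group column indices by base variable name, ordered by wave number.

    Returns (groups, non_longitudinal): `groups` is a list of index lists and
    `non_longitudinal` holds the indices of columns without a wave suffix.
    """
    waves_by_variable = defaultdict(list)
    non_longitudinal = []
    for index, column in enumerate(columns):
        match = WAVE_SUFFIX.match(column)
        if match:
            waves_by_variable[match["name"]].append((int(match["wave"]), index))
        else:
            non_longitudinal.append(index)
    groups = [
        [index for _, index in sorted(waves)]
        for waves in waves_by_variable.values()
    ]
    return groups, non_longitudinal


columns = ["age", "bp_w1", "chol_w1", "bp_w2", "chol_w2", "sex"]
groups, static = build_feature_groups(columns)
print(groups)
print(static)
```

Sorting by wave number keeps each group in temporal order even when the columns are interleaved in the file.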

More detailed examples and tutorials can be found in the documentation!


📝 How to Cite

If you use Auto-Sklong in your research, please cite our paper:

@INPROCEEDINGS{10821737,
  author={Provost, Simon and Freitas, Alex A.},
  booktitle={2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
  title={Auto-Sklong: A New AutoML System for Longitudinal Classification},
  year={2024},
  volume={},
  number={},
  pages={2021-2028},
  keywords={Pipelines;Optimization;Predictive models;Classification algorithms;Conferences;Bioinformatics;Biomedical computing;Automated Machine Learning;AutoML;Longitudinal Classification;Scikit-Longitudinal;GAMA},
  doi={10.1109/BIBM62325.2024.10821737}
}

🚀 What's New Compared to GAMA?

We enhanced @PGijsbers' open-source GAMA initiative by introducing a brand-new search space designed specifically for tackling longitudinal classification problems. This search space is powered by our custom library, Scikit-Longitudinal (Sklong), enabling Combined Algorithm Selection and Hyperparameter Optimization (CASH Optimization).
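To make the CASH idea concrete, here is a toy sketch in which each candidate is an algorithm together with hyperparameters that are valid only for that algorithm, searched with plain random search. The search space, the `toy_score` function, and all names are hypothetical stand-ins; this is not Auto-Sklong's or GAMA's actual search space or implementation.

```python
import random

# Hypothetical search space: each algorithm has its own hyperparameter ranges.
SEARCH_SPACE = {
    "decision_tree": {"max_depth": [2, 4, 8]},
    "knn": {"n_neighbors": [1, 5, 15], "weights": ["uniform", "distance"]},
}


def sample_configuration(rng):
    """Draw an algorithm first, then hyperparameters valid for that algorithm."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    params = {name: rng.choice(values)
              for name, values in SEARCH_SPACE[algorithm].items()}
    return algorithm, params


def toy_score(algorithm, params):
    """Stand-in for cross-validated performance; purely illustrative."""
    if algorithm == "decision_tree":
        return 0.70 + 0.02 * params["max_depth"]
    bonus = 0.03 if params["weights"] == "distance" else 0.0
    return 0.75 + bonus - 0.001 * params["n_neighbors"]


def random_search(n_trials, seed=0):
    """Return the best (score, algorithm, params) found in n_trials draws."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        algorithm, params = sample_configuration(rng)
        score = toy_score(algorithm, params)
        if best is None or score > best[0]:
            best = (score, algorithm, params)
    return best


print(random_search(n_trials=20))
```

The point of the joint formulation is that the algorithm choice and its hyperparameters are scored together, so the search can trade a well-tuned simple model against a poorly tuned complex one.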

Unlike GAMA or other existing AutoML libraries, Auto-Sklong offers out-of-the-box support for longitudinal classification tasks—a capability not previously available.

Search Space Viz.:

To better understand our proposed search space, refer to the visualisation below (read from left to right, each step being one new component to a final pipeline candidate configuration):

[Figure: Search Space Visualization]

While GAMA offers some configurability for search spaces, we improved its functionality to better suit our needs. You can find the details of our contributions in the following pull requests:

- ConfigSpace Technology Integration for Enhanced GAMA Configuration and Management 🥇
- Search Methods Enhancements to Avoid Duplicate Evaluated Pipelines 🥈
- SMAC3 Bayesian Optimisation Integration 🆕
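The idea of avoiding duplicate pipeline evaluations can be illustrated with a small cache keyed on a canonical form of each configuration. This is only a sketch of the general technique under assumed names (`canonical_key`, `EvaluationCache`, `toy_evaluate`); it does not reproduce the code in the pull requests listed above.

```python
def canonical_key(algorithm, params):
    """Build a hashable key that ignores hyperparameter ordering."""
    return (algorithm, tuple(sorted(params.items())))


class EvaluationCache:
    """Remember scores of configurations that were already evaluated."""

    def __init__(self, evaluate):
        self._evaluate = evaluate
        self._scores = {}
        self.evaluations = 0  # number of real (non-cached) evaluations

    def score(self, algorithm, params):
        key = canonical_key(algorithm, params)
        if key not in self._scores:
            self._scores[key] = self._evaluate(algorithm, params)
            self.evaluations += 1
        return self._scores[key]


def toy_evaluate(algorithm, params):
    """Stand-in for an expensive pipeline evaluation."""
    return len(algorithm) + sum(params.values())


cache = EvaluationCache(toy_evaluate)
cache.score("knn", {"n_neighbors": 5, "leaf_size": 30})
cache.score("knn", {"leaf_size": 30, "n_neighbors": 5})  # same pipeline, reordered
print(cache.evaluations)
```

Under a fixed time budget, every skipped duplicate leaves more of the budget for configurations that have not been tried yet.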

🔐 License

Auto-Sklong is licensed under the MIT License.

Owner

  • Name: Provost Simon
  • Login: simonprovost
  • Kind: user
  • Location: London
  • Company: @UniversityOfKent

Incoming Visiting Researcher @ NYU | TANDON 🇺🇸 –– Ph.D student @ University of Kent 🎓

GitHub Events

Total
  • Create event: 5
  • Release event: 3
  • Issues event: 3
  • Watch event: 22
  • Delete event: 3
  • Issue comment event: 8
  • Push event: 37
  • Pull request event: 2
Last Year
  • Create event: 5
  • Release event: 3
  • Issues event: 3
  • Watch event: 22
  • Delete event: 3
  • Issue comment event: 8
  • Push event: 37
  • Pull request event: 2

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 3
  • Total pull requests: 4
  • Average time to close issues: 3 months
  • Average time to close pull requests: 1 day
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 4.33
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 4
  • Average time to close issues: 3 months
  • Average time to close pull requests: 1 day
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 4.33
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • anderdnavarro (3)
Pull Request Authors
  • simonprovost (6)
Top Labels
Issue Labels
  • good first issue (2)
  • documentation (1)
Pull Request Labels
  • documentation (3)
  • migration (2)
  • enhancement (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 100 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 1
pypi.org: auto-sklong

A package for automated machine learning based on scikit-learn and sklong to tackle the longitudinal machine learning classification tasks.

  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 100 Last month
Rankings
Forks count: 8.7%
Stargazers count: 8.9%
Dependent packages count: 10.6%
Average: 22.1%
Dependent repos count: 60.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/documentation-deploy.yaml actions
  • actions/cache v4 composite
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
.github/workflows/precommit.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pre-commit/action v3.0.0 composite
.github/workflows/publish-pypi.yaml actions
  • actions/checkout v4 composite
  • actions/checkout v2 composite
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
  • pdm-project/setup-pdm main composite
  • pdm-project/setup-pdm v3 composite
.github/workflows/pytest.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
  • pdm-project/setup-pdm main composite
.github/workflows/test-with-pre.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
  • pdm-project/setup-pdm main composite
pyproject.toml pypi
uv.lock pypi
  • 194 dependencies