Scikit-Longitudinal

Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python - Published in JOSS (2025)

https://github.com/simonprovost/scikit-longitudinal

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 7 DOI reference(s) in README and JOSS metadata
✓ Academic publication links: Links to joss.theoj.org
○ Committers with academic emails
○ Institutional organization owner
✓ JOSS paper metadata: Published in Journal of Open Source Software

Keywords

classification longitudinal longitudinal-classification longitudinal-data longitudinal-studies machine-learning repeated-measurements scikit-learn supervised-learning

Scientific Fields

Engineering Computer Science - 80% confidence

Last synced: 6 months ago

Repository

☂️ Scikit-longitudinal (Sklong) is an open-source Python library, compliant with the Scikit-Learn API, tailored to longitudinal machine learning classification tasks. It is ideal for researchers, data scientists, and analysts, as it provides specialist tools for handling the challenges of repeated-measures data.

Basic Info

Host: GitHub
Owner: simonprovost
License: mit
Language: Python
Default Branch: main
Homepage: https://scikit-longitudinal.readthedocs.io/latest/
Size: 69.6 MB

Statistics

Stars: 66
Watchers: 4
Forks: 3
Open Issues: 3
Releases: 6

Topics

classification longitudinal longitudinal-classification longitudinal-data longitudinal-studies machine-learning repeated-measurements scikit-learn supervised-learning

Created almost 3 years ago · Last pushed 6 months ago

Metadata Files

Readme Changelog License Citation

README.md

Scikit-longitudinal

A specialised Python library for longitudinal data analysis built on Scikit-learn

💡 About The Project

Scikit-longitudinal (Sklong) is a machine learning library designed to analyse longitudinal data (currently focused on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.

Wait, what is longitudinal data, in layman's terms?

Longitudinal data is a "time-lapse" record of the same subject, entity, or group tracked over successive time periods, similar to checking in on patients to see how they change. For instance, doctors may monitor a patient's blood pressure, weight, and cholesterol every year for a decade to identify health trends or risk factors. This data is more useful for predicting future results than a one-time survey because it captures evolution, patterns, and cause-and-effect relationships over time.
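To make the blood-pressure example concrete, here is a minimal sketch in plain Python of what such data can look like when each yearly check-in (a "wave") is stored as its own column. The patient records, the column names, and the `_w1`/`_w2`/`_w3` suffix convention are all made up for illustration; they are not taken from the library or from any real dataset.

```python
# Hypothetical records: one row per patient, one column per yearly wave.
patients = [
    {"patient_id": 1, "systolic_bp_w1": 120, "systolic_bp_w2": 126, "systolic_bp_w3": 135},
    {"patient_id": 2, "systolic_bp_w1": 118, "systolic_bp_w2": 117, "systolic_bp_w3": 119},
]

# The same measurement, repeated over three waves, in chronological order.
WAVES = ["systolic_bp_w1", "systolic_bp_w2", "systolic_bp_w3"]


def trajectory(record):
    """Return one patient's measurements in chronological order."""
    return [record[column] for column in WAVES]


def total_change(record):
    """Difference between the last and the first wave for one patient."""
    values = trajectory(record)
    return values[-1] - values[0]


# A one-time survey would only see a single value per patient;
# the repeated measurements expose the trend.
changes = {p["patient_id"]: total_change(p) for p in patients}
print(changes)
```

Patient 1 shows a steady rise while patient 2 stays flat, a difference that a single snapshot of either patient would not reveal.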

Not enough?

For more scientific details, you can refer to our paper published in the Journal of Open Source Software (JOSS).
For more technical details, visit the official documentation.

🛠️ Installation

[!NOTE] Want to use Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation, where we explain it all! 🎉

To install Scikit-longitudinal:

✅ Install the latest version: `pip install Scikit-longitudinal`

To install a specific version: `pip install Scikit-longitudinal==0.1.0`

[!CAUTION] Scikit-longitudinal is currently compatible with Python 3.9 only. Ensure you have this version installed before proceeding with the installation.

We understand that this is a limitation, but for the time being we are tied to it because of Deep Forest, a dependency of Scikit-longitudinal that is not compatible with Python versions greater than 3.9. Deep Forest provides the Deep Forest algorithm, which we have modified to support Lexicographical Deep Forest.

To follow up on this discussion, please refer to this GitHub issue.

If you encounter any errors, see the installation section of the Getting Started guide in the documentation. If it still doesn't work, please open an issue on GitHub.
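Because of the Python 3.9 constraint, it can help to check the interpreter version before installing. The following is a minimal sketch using only the standard library; the helper name `is_supported_python` is hypothetical and not part of Scikit-longitudinal.

```python
import sys

# Supported (major, minor) version, as stated in the caution note.
SUPPORTED = (3, 9)


def is_supported_python(version_info=None):
    """Return True if the given (or the running) interpreter is Python 3.9."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == SUPPORTED


if is_supported_python():
    print("Python 3.9 detected: the installation can proceed.")
else:
    current = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {current} detected: Scikit-longitudinal currently requires Python 3.9.")
```

Running this in the environment you plan to install into shows whether a dedicated Python 3.9 environment is needed first.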

🚀 Getting Started

Here's how to analyse longitudinal data with Scikit-longitudinal:

```py
from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

# Note: this is a fictional dataset. Use yours!
dataset = LongitudinalDataset('./stroke.csv')
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="elsa")

model = LexicoGradientBoostingClassifier(
    features_group=dataset.feature_groups(),
    threshold_gain=0.00015,  # Refer to the API for more hyper-parameters and their meaning
)

model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, y_pred))
```
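The example relies on a pre-set grouping of features into temporal dependencies. As a rough illustration of what setting them manually could involve, the sketch below groups column positions by attribute, ordered by wave. The `_w<number>` naming scheme, the column names, and the helper `group_columns_by_attribute` are assumptions made for this sketch; whether the library expects exactly this structure should be checked against its API documentation.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: "<attribute>_w<wave number>", e.g. "smoke_w2".
WAVE_SUFFIX = re.compile(r"^(?P<name>.+)_w(?P<wave>\d+)$")


def group_columns_by_attribute(columns):
    """Group column indices by attribute, each group ordered by wave number.

    Columns that do not follow the naming scheme (e.g. "age") are treated
    as non-longitudinal and left out of every group.
    """
    groups = defaultdict(list)
    for index, column in enumerate(columns):
        match = WAVE_SUFFIX.match(column)
        if match:
            groups[match.group("name")].append((int(match.group("wave")), index))
    # Sort each attribute's (wave, index) pairs by wave, then keep only the indices.
    return [[index for _, index in sorted(waves)] for waves in groups.values()]


columns = ["age", "smoke_w1", "cholesterol_w1", "smoke_w2", "cholesterol_w2"]
print(group_columns_by_attribute(columns))
```

Each inner list holds the column positions of one attribute across its waves, which is the kind of information a longitudinal-aware estimator needs in order to treat repeated measurements of the same variable together.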

📝 How to Cite

If you use Sklong in your research, please cite our paper:

@article{Provost2025,
  doi = {10.21105/joss.08481},
  url = {https://doi.org/10.21105/joss.08481},
  year = {2025},
  publisher = {The Open Journal},
  volume = {10},
  number = {112},
  pages = {8481},
  author = {Provost, Simon and Freitas, Alex A.},
  title = {Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python},
  journal = {Journal of Open Source Software}
}

🔐 License

Scikit-longitudinal is licensed under the MIT License.

Owner

Name: Provost Simon
Login: simonprovost
Kind: user
Location: London
Company: @UniversityOfKent

Website: https://bento.me/simon-provost
Repositories: 26
Profile: https://github.com/simonprovost

Incoming Visiting Researcher @ NYU | TANDON 🇺🇸 –– Ph.D student @ University of Kent 🎓

JOSS Publication

Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python

Published

August 01, 2025

DOI

10.21105/joss.08481

Volume 10, Issue 112, Page 8481

Authors

Simon Provost

School of Computing, University of Kent, Canterbury, United Kingdom

Alex A. Freitas

School of Computing, University of Kent, Canterbury, United Kingdom

Editor

Evan Spotte-Smith

GitHub Events

Total

Create event: 7
Commit comment event: 2
Issues event: 19
Release event: 2
Watch event: 35
Delete event: 7
Issue comment event: 43
Push event: 50
Pull request event: 13
Fork event: 2

Last Year

Create event: 7
Commit comment event: 2
Issues event: 19
Release event: 2
Watch event: 35
Delete event: 7
Issue comment event: 43
Push event: 50
Pull request event: 13
Fork event: 2

Committers

Last synced: 8 months ago

All Time

Total Commits: 174
Total Committers: 2
Avg Commits per committer: 87.0
Development Distribution Score (DDS): 0.098

Past Year

Commits: 52
Committers: 1
Avg Commits per committer: 52.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Provost Simon	s**t@e**u	157
sgp28	s**v@g**n	17
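The committer statistics above can be reproduced from the per-committer commit counts. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; this definition is an assumption, though it matches the reported figures.

```python
def development_distribution_score(commit_counts):
    """Return 1 - (commits by the top committer / total commits).

    Assumed definition: 0.0 means a single person made every commit;
    values closer to 1.0 mean the work is spread over many committers.
    """
    total = sum(commit_counts)
    if total == 0:
        return 0.0
    return 1 - max(commit_counts) / total


all_time = [157, 17]  # commits per committer, from the Top Committers table
past_year = [52]      # a single committer in the past year

print(round(development_distribution_score(all_time), 3))   # 0.098
print(sum(all_time) / len(all_time))                         # 87.0
print(round(development_distribution_score(past_year), 3))  # 0.0
```

With 157 of 174 commits coming from one person, the all-time score is low, and it drops to 0.0 for the past year because only one committer was active.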

Committer Domains (Top 20 + Academic)

gmail.con: 1
epitech.eu: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 27
Total pull requests: 40
Average time to close issues: 23 days
Average time to close pull requests: about 11 hours
Total issue authors: 7
Total pull request authors: 1
Average comments per issue: 2.0
Average comments per pull request: 0.1
Merged pull requests: 36
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 13
Pull requests: 10
Average time to close issues: 4 days
Average time to close pull requests: 43 minutes
Issue authors: 7
Pull request authors: 1
Average comments per issue: 3.62
Average comments per pull request: 0.2
Merged pull requests: 8
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

simonprovost (15)
blengerich (4)
TahiriNadia (3)
klemenpe (2)
orduek (1)
edqzhang (1)
SteffKL (1)

Pull Request Authors

simonprovost (40)

Top Labels

Issue Labels

Classification (6) Preprocessing (3) enhancement (2) Preparation (2) documentation (1) help wanted (1)

Pull Request Labels

documentation (8) enhancement (8) Classification (4) bug (4) Preprocessing (3) Preparation (3) tests (3) migration (2)

Packages

Total packages: 2
Total downloads:
- pypi 330 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 0
(may contain duplicates)
Total versions: 16
Total maintainers: 1

pypi.org: scikit-longitudinal

Scikit-longitudinal is an open-source Python library for longitudinal data analysis, building on Scikit-learn's foundation with tools tailored for repeated measures data.

Homepage: https://github.com/simonprovost/scikit-longitudinal
Documentation: https://scikit-longitudinal.readthedocs.io/latest/
License: MIT
Latest release: 0.1.1
published 6 months ago

Versions: 12
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 215 Last month

Rankings

Dependent packages count: 10.7%

Average: 35.5%

Dependent repos count: 60.3%

Maintainers (1)

SimonProvost

Last synced: 6 months ago

pypi.org: scikit-lexicographical-trees

A set of python modules for machine learning and data mining

Homepage: https://simonprovost.github.io/scikit-longitudinal/
Documentation: https://scikit-lexicographical-trees.readthedocs.io/
License: new BSD
Latest release: 0.0.4
published over 1 year ago

Versions: 4
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 115 Last month

Rankings

Dependent packages count: 10.7%

Average: 35.5%

Dependent repos count: 60.3%

Maintainers (1)

SimonProvost

Last synced: 6 months ago
